equipment. Of major interest were the computer selection, array
processor selection and basic signal processing [illegible]
real-time utilization.
ABBREVIATIONS

A1 -- AP-120B Adder Register One
A2 -- AP-120B Adder Register Two
A/D -- Analog to Digital
ADAM -- Analog Data Acquisition Module
ALU -- Arithmetic Logical Unit
ANSI -- American National Standards Institute
AP -- MAP-300 Arithmetic Processor
APGET -- Get Data From the AP-120B
APPUT -- Put Data Into the AP-120B
AP1 -- MAP-300 Arithmetic Processor One
AP2 -- MAP-300 Arithmetic Processor Two
AP-120B -- Floating Point Systems Array Processor Model 120-B
APAL -- AP-120B Cross-Assembler
APEX -- AP-120B Executive Routine
APDMA -- AP-120B AP Direct Memory Address Register
APLINK -- AP-120B Linker and Loader
APMATH -- AP-120B Math Library
APMAE -- AP-120B Memory Address Extension
APSIM -- AP-120B Simulator
[illegible] -- AP-120B Path Tester Program
APS -- MAP-300 Addresser Processor Section
APU -- MAP-300 Arithmetic Processing Unit
CVMUL -- Complex Vector Multiply
CPU -- Central Processing Unit
CSPU -- MAP-300 Central System Processing Unit
CSW -- MAP-300 Control Status Register [partly illegible] Word
CTL -- AP-120B Control Register
DCB -- MAP-300 Driver Control Block
DIO -- Direct Input/Output
DMA -- Direct Memory Access
DPA -- AP-120B Data Pad Address Register
DPX -- AP-120B Data Pad X
DPY -- AP-120B Data Pad Y
FA -- AP-120B Adder Result Register
FCB -- MAP-300 Function Control Block
FFT -- Fast Fourier Transform
FIFFT -- Forward/Inverse Fast Fourier Transform Test
FIFO -- First In First Out
FL -- AP-120B Adder Results Less Than Zero
FM -- AP-120B Multiplier Result Register
FMT -- AP-120B Format Register
FN -- AP-120B Function Register
FO -- AP-120B Adder Exponent Overflow
FU -- AP-120B Adder Exponent Underflow
FZ -- AP-120B Adder Results Equal Zero
HMA -- AP-120B Host Memory Access Register
HIC -- MAP-300 Host Interface Controller
HIM -- MAP-300 Host Interface Module
HIS -- MAP-300 Host Interface Scroll
IOS -- MAP-300 Input/Output Scroll
IQ -- MAP-300 Input Queue
LIFO -- Last In First Out
LITES -- AP-120B Lights Register
M1 -- AP-120B Multiplier Unit Number One
M2 -- AP-120B Multiplier Unit Number Two
MAP -- Macro Array Processor
MAP-300 -- CSPI Macro Array Processor Model 300
MD -- AP-120B Main Data Memory Output Buffer Register
MI -- AP-120B Main Data Memory Input Buffer
MOS -- Metal Oxide Semiconductor
MTBF -- Mean Time Between Failure
MTTR -- Mean Time To Repair
NOP -- No Operation
OQ -- MAP-300 Output Queue
[illegible] -- MAP-300 Program Counters One Through Three
P -- MAP-300 Multiplier Results Register
PIOC -- AP-120B Programmable Input/Output Channel
PIOP -- AP-120B Programmable Input/Output Processor
R -- MAP-300 Adder Results Register
RAF -- MAP-300 Read Address FIFO
RAMP -- Reliability And Maintenance Program
RFFT -- Real to Complex FFT
RFFTSC -- Real FFT Scale and Format
ROM -- Read-Only Memory
S-Pad -- AP-120B Scratch Pad
SNAP-II -- MAP-300 Systematic Notation For Array Processing Version II
SPFN -- AP-120B S-Pad Output Buffer Register
SRA -- Subroutine Return Address
SWR -- AP-120B Switch Register
SYSFLG -- MAP-300 System Flag Register
TM -- AP-120B Table Memory
TMA -- AP-120B Table Memory Address Register
TMRAM -- AP-120B Random Access Table Memory
VAC -- Volts Alternating Current
VMUL -- Vector Multiply
WAF -- MAP-300 Write Address FIFO
WC -- AP-120B Word Count Register





I. INTRODUCTION


The purpose of this study is to begin evaluation of a proposed
signal-processing test bed similar to the test bed being installed
at the Naval Postgraduate School, Monterey, California. The basic
test bed consists of an analog subsystem (fig 1), data-processing
subsystem (fig 2), signal-processing subsystem (fig 3) and display
subsystem (fig 4) to be used for general-purpose Naval research.


The analog subsystem of the test bed was designed for signal
[illegible] and conditioning. This is basically accomplished by a
126-line input into a programmed matrix switch which emits 32 lines
of output. These 32 lines then continue through a
program-controlled filter issuing output from the subsystem.


The signal-processing subsystem receives results from the analog
subsystem via an AN-5400 A/D converter. This information can then
be stored in an Ampex Megastore unit to be later processed by the
MAP-300 array processor. A PDP-11/34 computer controls the mass
storage device, the array processor and input functions. Output is
directed to the data-processing subsystem.


The data-processing subsystem receives the processed data and
controls the operation of the display subsystem. Display devices
presently include a Ramtek 9500 Video Display Unit (color and
shades of gray), the Versatec 1600A printer/plotter and an EPC 2300
Gram Writer.


The goal of this study was to examine the major system components,
computers, array processors and major data paths to determine
feasibility for various uses and suggest possible alternative
methods, especially in the real-time environment. The basic task of
the test bed was assumed to be general with no suggestion of
specific tasks, although it was recognized that many uses and data
rates may be utilized.


Chapter II discusses specific computer manufacturers and computer
types. Chapters III, IV and V deal with the two most popular
general-purpose array processors on the market, discussing the pros
and cons of each. Chapter VI gives final conclusions and
recommendations concerning the proposed test bed.





II. COMPUTERS

A. GENERAL


For the test bed evaluation, choosing the proper computer is
important since a varying amount of computational power is required
for each subsystem. Also, a gamut of functions and uses may be
tried, necessitating a system that can realistically emulate many
speed, cost and memory constraints. A common and popular system
affords better software support while still maintaining a low
price. The ability to rely on system support is an important issue
when considering long term use. A popular system tends to develop
newer, more efficient software packages earlier and more frequently
than do less used systems.


In large array processing applications with many display devices
the ideal situation would be for one computer to initially load the
array processor and then act as a "whole system" monitor and
statistician. It could also perform the information gathering
function while another computer would act as the output processor
for the array processor and control the display devices. That
situation would be similar to that of a test bed where flexibility
may be the key and being computer-bound would be highly undesirable
and possibly unjustly influence the evaluation of the array
processor. An ultimate goal might be to choose the smallest
computer capable of operating the array processor and associated
display devices in the desired manner while providing room for
expansion. It is realized that for test and research activities
more computing power may be necessary than would be needed for
production activities.

In October 1975, the Computer Family Architecture Selection
Committee was formed to evaluate computer architecture candidates
as a basis for a family of software-compatible military computers.
Ten Army and 17 Navy organizations were represented on the
selection committee. The purpose was to select an architecture that
could be used as a standard, had a proven instruction set and could
be used in advanced technologies.


B. PDP-11 FAMILY


The committee voted that the PDP-11 had the best architecture for
use in the Military Computer Family. However, it generally
contained a small address space and possible floating point
instruction compatibility problems with existing systems. The IBM
System 370 was ranked second and the Interdata 8/32 ranked third
[12]. The Digital Equipment Corporation PDP-11 series provided a
popular example of both the price and performance excellence in
available computer systems. Their popularity is evidenced by the
shipping of [illegible] and 10,000 PDP-11/34 computers as of 1975
and 1976 respectively [28]. The relevant PDP-11 computers
considered were the PDP-11/04, PDP-11/34, PDP-11/45, PDP-11/55,
PDP-11/60, and the PDP-11/70 (listed from least powerful to most
powerful). What follows is a brief description of each system.
Unless otherwise stated, it will be assumed that the more powerful
system will contain all the features of systems less powerful. The
PDP-11/03 and the LSI-11 series were not considered due to their
not having the advantages of the UNIBUS [28].
1. PDP-11/04


The PDP-11/04 is the smallest computer of the PDP-11 series,
containing the entire central processing unit on one board,
permitting room for drastic expansion due to unused chassis area.
The system contains self-test diagnostics to determine system
operability every time the processor has power applied, the console
emulator is used or the bootstrap routines are initiated. The
console emulator allows the operator to control the system from a
terminal without physically throwing switches or reading lights on
the front panel of the unit. The bootstrap loader automatically
restarts the system from various peripheral devices without need of
physical switch throwing. Memory size varies from [illegible] bytes
to 56K bytes (8 bits = 1 byte) of either MOS (metal oxide
semiconductor) or core type with an average access time of
500-nanoseconds and system cycle time of [illegible]-nanoseconds. A
typical cost of this system is $5950 [29].

2. PDP-11/34

The PDP-11/34 is the next size of the PDP-11 family and is the
smallest architecture to contain a memory management routine to
provide program protection so user programs cannot access or change
system memory space. (In the PDP-11/04 it is the programmer's
responsibility to maintain and protect this area.) Memory
management also allows virtual memory paging of up to 16 pages
ranging in size from [illegible] bytes to 8K bytes for a total
possible memory of 256K bytes of which 128K is physical. (The
highest 4K of address space on the PDP-11/34/45/55/60/70 is used
for registers that store I/O data or status of individual
peripheral devices. This means that the 11/34 can physically
address [illegible] bytes but virtually address 256K bytes.) The
11/34 allows both core memory and MOS memory to be used
concurrently.


The PDP-11/34 also contains a memory option called cache memory
which is a [illegible]K high speed (300-nanosecond cycle time)
memory used to store a copy of the most recently selected portions
of main memory, affording faster access of instructions and data.
The "hit" ratio, or fraction of accesses for which the next item is
resident in cache, is approximately 86 percent for the 11/34. Time
is saved by less area to access, therefore less search time, and
shorter, less complicated data transmission. Since MOS memory is
volatile (loses information when power is removed), the 11/34 has a
battery backup option which will retain information in the MOS
memory for approximately two hours. The PDP-11/34 can operate in
two modes, Kernel and User. This two mode concept is important in
security since the User mode is prevented from executing certain
instructions that could cause modification of the Kernel program,
halt the computer or use memory space assigned to the Kernel or
other users. Monitoring and supervisory routines are executed in
the Kernel mode. The Kernel/User concept is important since if the
Kernel can be made secure, the overall security of the operating
system from accidental harm is much easier to achieve. Prices range
from $11,080 to $53,800 [29].
3. PDP-11/45


The PDP-11/45 system is designed for speed. The high-speed central
processor allows program execution of three million instructions
per second and has either 500-nanosecond bipolar memory or
980-nanosecond core memory available. MOS memory is also available
as an "add-on" option. Total memory space is the same as the 11/34.
There is an optional floating point processor to handle double
precision arithmetic. The system is especially good for
multiple-task applications; otherwise it is the same as the 11/34.
The price is $41,800 [29].





4. PDP-11/55


The PDP-11/55 system improves on the 11/45 by offering a dual bus
structure to allow intermixing core and bipolar memory (up to 243K
with memory management) to optimize system performance. Two
separate semiconductor controllers allow simultaneous data transfer
for increased system throughput. Both the 11/45 and 11/55 hardware
have been optimized towards a multiprogramming environment by
adding a third mode, Supervisor, to control system operation while
properly handling multi-user operations [30]. The price is $50,400
to $80,780 [29].
5. PDP-11/60


The PDP-11/60 system is the interface between the mid-range mini
and the more powerful mini. With the 11/60 came the first
capability to microprogram and four levels of priority interrupts.
The system was also designed with the engineering trade-off between
ease of maintenance and reliability in mind. A system that is very
difficult to repair after failure may be less useful than an easy
system to repair that fails more often. The availability of the
system is a measure of mean time between failure divided by the
quantity mean time between failure plus mean time to repair
(MTBF/(MTBF + MTTR)) [30]. Digital Equipment Corporation has tried
to offset the more complex architecture (probable higher failure
rate) by providing a Reliability and Maintenance Program (RAMP)
software package to help locate software and hardware errors,
decreasing the MTTR and thereby increasing availability. The price
ranges from [illegible] to over $200,000.
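The availability relation MTBF/(MTBF + MTTR) can be checked
numerically. The following Python sketch is illustrative only: the
function name and the MTBF and MTTR figures are assumptions chosen
for the example and are not taken from the references.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Availability = MTBF / (MTBF + MTTR)."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("times must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: a reliable system that is hard to repair
# versus a less reliable system that is easy to repair.
hard_to_repair = availability(mtbf_hours=2000.0, mttr_hours=100.0)
easy_to_repair = availability(mtbf_hours=1000.0, mttr_hours=10.0)

print(f"hard to repair: {hard_to_repair:.4f}")
print(f"easy to repair: {easy_to_repair:.4f}")
```

With these assumed figures the system that fails twice as often
still shows the higher availability, because its repair time is ten
times shorter.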
6. PDP-11/70

The PDP-11/70 is the largest of the PDP-11 series and gives the
power of a large computer at the cost ($63,000 to [illegible] [29])
of a minicomputer. It was designed to operate in high-performance
systems and is ideally suited to real-time systems due to the high
speed of execution and the [illegible]0-95 percent "hit" ratio of
cache memory. Addressing of over four megabytes of physical memory
is theoretically possible with the 22 bit addresser, although 256K
of this 4M must be used for the UNIBUS referencing. (The UNIBUS can
only address 18 bits, therefore the memory management routine must
convert the 4 megabyte addresses as if it were a real location.) At
the present time, however, only 2M of physical memory can actually
be accommodated by the UNIBUS. There is the option to use 64 bit
floating point numbers in calculations. With two megabytes of main
memory there is little concern for memory constraints during a
multi-task environment. The option of attaching high speed mass
storage devices to the central processing unit through dedicated
paths is available. The system has eight levels of priority and a
large amount of flexibility in its programming, making it possible
to run several levels of display devices under varying loading
conditions.
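The 4M and 256K figures follow directly from the address widths.
The short Python sketch below reproduces that arithmetic; it
assumes byte addressing, and the function and variable names are
hypothetical.

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


K = 1024
M = K * K

physical = addressable_bytes(22)  # 22 bit physical addresser
unibus = addressable_bytes(18)    # 18 bit UNIBUS addresses

print(physical // M, "M bytes with 22 address bits")
print(unibus // K, "K bytes with 18 address bits")
print((physical - unibus) // K, "K bytes remaining after UNIBUS referencing")
```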







III. ARRAY PROCESSORS


An Array Processor is a unit capable of performing floating point
operations on large data arrays or data streams. It usually
operates as a peripheral device to a "host" computer system and
best performs the repetitious reiterative operations requiring a
large number of additions and multiplications typically encountered
in calculations such as correlations and fast Fourier transforms.
This system is special purpose and cannot operate by itself since
it has no executive functions except those necessary to control the
mathematics required to perform additions, multiplications and data
movement.


With an array processor, large transforms can be achieved dependent
only on memory capacity. These transforms can be done faster than
in the normal CPU since the array processor performs only one
function at a time (here function is used in the broader sense as
in transposition) and there is no need for the normal overhead
control logic of a general purpose computer. This is more
advantageous than a special purpose computer in that an array
processor can be programmed to execute various array processing
applications and can also act as a peripheral. Ideally a system
would be wanted that could handle any size array, including the
possibility of very large arrays if the situation warranted. This
is theoretically possible by using sequential processing and
stringing a series of array processors together, having each
perform a specific operation. That would only be good, however, for
applications not needing results of data processed in step N to be
used in step N-1. [Illegible] array processor, [illegible] and
sufficient performance of large arrays is possible due to the
special architecture and memory of the array processor.

Two general purpose array processors presently seem to dominate the
market. These are the CSP Inc. MAP-300 (Macro Array Processor) and
the Floating Point Systems AP-120B. While the basic function of
each is similar, the actual implementation is quite different.


The theoretical advantage/disadvantage of each processor will be
discussed in detail, comparing architecture, operational
characteristics, software support and programmability. Chapter VI,
Conclusions and Recommendations, will discuss the actual problems
encountered with the installation of the MAP-300 system to be used
in the evaluation here at the Naval Postgraduate School.
IV. AP-120B ARRAY PROCESSOR

A. GENERAL


The AP-120B Array Processor (fig 5) is manufactured by Floating
Point Systems Inc., Portland, Oregon. It operates synchronously
using a 167-nanosecond cycle time master clock synchronized with a
50 percent safety margin every cycle for worst-case temperature and
voltage. The system uses pre-conditioned medium-scale integrated
circuitry, large-scale integrated circuitry and
transistor-to-transistor logic. The AP-120B is capable of operating
in temperatures from 10 to 40 degrees centigrade at 0 to 90 percent
relative humidity. This processor is also able to operate using one
of these various power options: 105/125 VAC at 120 amps, 180/228
VAC at 10 amps or 210/250 VAC at 10 amps with either [illegible]
hertz or 50/400 hertz available [7].

The AP-120B employs a technique known as pipeline processing to
increase throughput. Pipeline processing utilizes a combination of
the elements of both sequential processing and parallel processing.
A single basic processor, like an adder, is logically divided into
integral units that can each perform a specific and separable
function while another unit of the adder simultaneously performs
another function of the addition task. When one task is completed,
it will move on to the next step in the sequence, allowing the just
vacated section of the adder to be filled with the next task in the
queue. Throughput is increased by insuring that the entire system
is always full. This technique works with both the adder and the
multiplier of the AP-120B. Pipelining is good for vector operations
since vectors are basically independent and a solution of vector N
is not needed before vector N+1 can be started. However, scalar
operations are basically sequential operations and cannot make use
of pipelining. By carefully considering every operation, especially
those in loops, the programmer can squeeze more operations per time
interval by pipelining than would be possible using standard
sequential techniques. The time is generally limited by the
multiplication time [14].
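The benefit of keeping the pipeline full can be estimated with a
simple cycle count. The Python sketch below is a simplified model
under stated assumptions: a three-stage unit, one new operation
accepted per cycle and no data path conflicts. The function names
and the operation count are hypothetical; only the 167-nanosecond
cycle time comes from the text.

```python
CYCLE_NS = 167  # master clock cycle time quoted in the text


def pipelined_cycles(stages: int, operations: int) -> int:
    """Cycles to finish independent operations fed one per cycle."""
    if operations == 0:
        return 0
    # The first result appears after `stages` cycles; every further
    # result follows one cycle later.
    return stages + operations - 1


def sequential_cycles(stages: int, operations: int) -> int:
    """Cycles when each operation must wait for the previous result."""
    return stages * operations


stages = 3        # assumed stage count for illustration
operations = 100  # hypothetical vector length

print(pipelined_cycles(stages, operations) * CYCLE_NS, "ns pipelined")
print(sequential_cycles(stages, operations) * CYCLE_NS, "ns sequential")
```

Under this model the independent (vector) case needs 102 cycles
while the dependent (scalar) case needs 300, which is the reason
scalar work cannot take advantage of pipelining.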


The AP-120B instruction word is up to 64-bits long and can perform
a maximum of ten different operations in a single cycle. As an
example, an add, a multiply, a move to and from each data pad
(there are two) and an address increment or decrement can all be
performed in the same cycle. Any one instruction or combination of
the above can be performed as long as the resource required is not
being used in another operation (some operations are multi-cycle
and "lock-out" the resource until they are complete). It is the
programmer's obligation to insure that all required resources are
available when they are requested or else they will be lost. As an
example, a read from a data pad takes at least two cycles. If cycle
N wanted to read from Data Pad X and cycle N-1 already initiated a
read from Data Pad X, the entire instruction word for cycle N would
be delayed one cycle waiting for the resource to become available.
This ability to perform more than one basic operation per cycle
allows a theoretical 30 million instructions per second to be
executed. Due to memory size limitations and algorithms not needing
ten operations per instruction word for sustained periods, this
rate can never be realistically attained except possibly for short
bursts. Since some of these operations are housekeeping functions,
the maximum number of arithmetic operations per second
theoretically possible is twelve million for vectors and five
million for scalars (scalar speed is much lower since it requires
sequential processing and cannot take advantage of pipelining).
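The twelve million figure for vectors is consistent with one add
and one multiply completing in every 167-nanosecond cycle. The
Python sketch below shows that arithmetic; the assumption of
exactly two arithmetic results per cycle is an interpretation of
the text, and the names are hypothetical.

```python
CYCLE_SECONDS = 167e-9        # master clock cycle time from the text
ARITHMETIC_OPS_PER_CYCLE = 2  # assumption: one add and one multiply

cycles_per_second = 1.0 / CYCLE_SECONDS
peak_arithmetic_rate = ARITHMETIC_OPS_PER_CYCLE * cycles_per_second

print(f"{cycles_per_second / 1e6:.2f} million cycles per second")
print(f"{peak_arithmetic_rate / 1e6:.2f} million arithmetic operations per second")
```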


The AP-120B uses a 38-bit data word which Floating Point Systems
contends generates better accuracy than the 32-bit word commonly
used by other systems [7]. This 38-bit word consists of a ten-bit
biased exponent and 28-bit twos complement mantissa, thereby
allowing numbers in a range of [illegible] * 10 ** -155 to
6.7 * 10 ** 153 to be represented. The 28-bit mantissa allows for
extensive calculations without significant truncation errors, or a
maximum relative error of approximately 7.5 * 10 ** -9 per
arithmetic operation or [illegible] decimal digit accuracy.
Floating Point Systems Inc. also employs a technique known as
convergent rounding which they assert forces the roundoff error to
approach zero.
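The upper range limit and the relative error can be cross-checked
from the word format. The Python sketch below assumes that the
exponent bias is half the exponent range and that the mantissa is a
signed fraction with magnitude below one; neither detail is stated
in the text, so the result is only a plausibility check.

```python
import math

EXPONENT_BITS = 10
MANTISSA_BITS = 28  # includes the sign of the twos complement mantissa

# Assumption: exponents run from -512 to +511 after removing the bias.
max_exponent = 2 ** (EXPONENT_BITS - 1) - 1
largest_magnitude = 2.0 ** max_exponent

# One unit in the last place of a 27-bit fraction.
relative_error = 2.0 ** -(MANTISSA_BITS - 1)
decimal_digits = (MANTISSA_BITS - 1) * math.log10(2)

print(f"largest magnitude ~ {largest_magnitude:.1e}")
print(f"relative error    ~ {relative_error:.2e}")
print(f"decimal digits    ~ {decimal_digits:.1f}")
```

Under these assumptions the sketch gives about 6.7 * 10 ** 153 and
7.45 * 10 ** -9, in line with the figures quoted for the 38-bit
word.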
The AP-120B does not contain the normal bus structure of other
array processors but instead uses dedicated 38-bit data paths for
the movement of data. There are two paths available to the adder
(one for each input register), two paths to the multiplier and
three paths available to the memory and data pads. This allows
seven independent data items to be transferred each cycle. (This,
coupled with an add, multiply and address increment/decrement,
equals the ten instructions per cycle possible.) These separate
data paths eliminate the need for a handshaking arrangement between
logic elements, although handshaking is required when the AP-120B
communicates with the host [7,30].


The price of a unit which includes the AP-120B array processor,
interface with the [illegible], [illegible] words of
333-nanosecond interleaved MOS memory, expansion chassis,
installation, [illegible] words of program source memory, 512 words
of Read Only Memory (ROM) table memory, a linker, loader,
simulator, debugger, a math library and executive is $50,970.00
[10]. This includes a 90-day warranty with a servicing agreement
available at extra cost. The field test mean time between failure
is 3500 hours [3].


The following section explains the hardware of the AP-120B in
detail.


B. CHARACTERISTICS AND HARDWARE

1. Multiplier





The Multiplier unit (fig 6) consists of two 38-bit multiplier
registers M1 and M2, three multiplication stages and a 38-bit
register to store the result (FM). To receive a resultant after
initiating the multiply, three cycles or 500-nanoseconds are
required. Inputs to the M1 register can come from Data Pad X (DPX),
Data Pad Y (DPY), Table Memory (TM) or the Multiplier result
register (FM). Inputs to M2 are either from DPX, DPY, Adder result
register (FA) or Main Data Memory Output Buffer (MD). Results from
the multiplier can go to M1, the Adder input register (A1), Main
Data Memory input buffer (MI), DPX or DPY.


Stage one of the multiplier starts the product by beginning the
multiplication of the two 28-bit mantissas. This multiplication is
completed in stage two, resulting in a 56-bit mantissa. Stage three
adds the exponents as it normalizes and convergently rounds the
56-bit mantissa to 28-bits. This stage also detects exponent
overflow/underflow and if either exists will set the proper bit in
the status register. The status register may be read by the program
to determine if conditions are met after an arithmetic operation,
to specify errors, or to aid in branching logic. These bits are
available for testing one cycle after completion of the multiply.
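The rounding step of stage three can be illustrated in software.
The Python sketch below takes "convergent rounding" to mean
round-half-to-even, which is the usual meaning of the term but is
not defined in the text. It treats mantissas as unsigned integers
and omits normalization and exponent handling, so it is a
simplified illustration rather than a model of the hardware.

```python
MANTISSA_BITS = 28


def convergent_round(value: int, drop_bits: int) -> int:
    """Discard the low `drop_bits` bits, rounding ties to even."""
    half = 1 << (drop_bits - 1)
    kept = value >> drop_bits
    remainder = value & ((1 << drop_bits) - 1)
    # Round up above the halfway point, or exactly at it when the
    # kept value is odd.
    if remainder > half or (remainder == half and kept & 1):
        kept += 1
    return kept


# Two full-scale 28-bit mantissas give a product that needs 56 bits.
a = (1 << MANTISSA_BITS) - 1
b = (1 << MANTISSA_BITS) - 1
product = a * b

print(product.bit_length(), "bit product")
print(convergent_round(product, MANTISSA_BITS).bit_length(), "bit result")
```

Rounding ties to the nearest even value means that halfway cases
round up as often as down, which is why the accumulated roundoff
error tends toward zero over many operations.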


This three stage multiply allows pipelining to be used since each
stage is independent of the other two, which permits a
multiplication result to be present at the result register every
167-nanoseconds once the pipeline becomes full (three cycles
required to fill). Note that 500-nanoseconds are required if the
result of the multiplication is required in the next multiplication
as is the case with scalar arithmetic.


A readily apparent problem with the multiplier is that M1 receives
inputs from both the Table Memory (TM) and the Multiplier Result
register (FM) while M2 receives inputs from neither. Therefore, if
a constant from TM were to be multiplied by the result of a
just-completed multiplication, it would require an extra two cycles
since either FM or TM would first have to be written into DPX or
DPY and then written into M2. This disadvantage is overshadowed by
the fact that even though dedicated data lines cause the above
problem, they present a distinct advantage by allowing multiple
data transfers in any given cycle [32].
2. Adder


The operation of the adder (fig 7) is similar to that of the
multiplier and consists of two 38-bit adder registers A1 and A2,
two adder stages and an adder result register (FA). The addition of
two numbers requires 333 nanoseconds (two cycles). Inputs to A1 are
from Table Memory (TM), Multiplier Output register (FM), Data Pad X
(DPX), Data Pad Y (DPY) and the ZERO constant while inputs to A2
are from the Adder Output register (FA), Data Pad X (DPX), Data Pad
Y (DPY) and the ZERO constant. The results of the adder can go to
A2, M2, DPX, DPY or MI. Stage one aligns the mantissas by shifting
the smaller value, based on the value of the exponent, to the right
until both exponents are equal, then adding or subtracting these
mantissas. Stage two normalizes and convergently rounds the
mantissa and adjusts the exponent. This stage also sets four bits
in the status register to denote results equal zero (FZ), results
less than zero (FL), exponent overflow (FO) or exponent underflow
(FU). These bits may be tested by other program instructions one
cycle after the addition is completed. (Note that FO and FU are the
same bits that are set by the multiplier on exponent overflow or
underflow.)
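The alignment performed in stage one can be sketched as follows.
This Python example uses non-negative integer mantissas and binary
exponents and leaves out the normalization and rounding of stage
two; the function name and the sample operands are hypothetical.

```python
def align_and_add(m1: int, e1: int, m2: int, e2: int):
    """Shift the operand with the smaller exponent right until the
    exponents match, then add the mantissas."""
    if e1 < e2:
        m1 >>= e2 - e1
        e1 = e2
    elif e2 < e1:
        m2 >>= e1 - e2
    return m1 + m2, e1


# 12 * 2**3 + 8 * 2**1 = 96 + 16 = 112 = 14 * 2**3
print(align_and_add(12, 3, 8, 1))
```

Bits shifted off the right end of the smaller operand are lost in
this sketch, which is one source of the truncation error that the
rounding in stage two is meant to control.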


As with the multiplier, the two-stage adder allows pipelining and a
result can be generated every 167 nanoseconds. The adder does not
have the disadvantage of inputting Table Memory (TM) values at the
same register as FA but does have the multiplier result FM at the
same adder input register (A1) as TM values. There is therefore not
the ability to immediately add a FM value with a TM value without
first going through DPX or DPY [32].


For both the adder and the multiplier there would be a two cycle
time loss if FM was just loaded with a new value from the
multiplier when it was needed for the addition/multiplication
process (time N) and only a one cycle loss if it was ready the
cycle before needed (time N - 1). Otherwise there would be no loss
of time since steps could be taken to move the value in FM through
the DPX or DPY which would make it available at the
adder/multiplier input register when necessary. (Presupposing of
course that the data paths to or from memory were not needed for
other uses.)

3. S-Pad


The S-Pad (fig 8) (pseudonym for Scratch Pad) consists of the S-Pad
Memory, S-Pad Arithmetic Logical Unit (ALU), Data Pad Address
Register (DPA), Memory Address Register (MA) and the Table Memory
Address Register (TMA). The sole purpose of the S-Pad is to compute
addresses for Table Memory, Main Data Memory and the Data Pads. The
S-Pad can operate concurrently with the memories, Multiplier and
Adder [7].


The S-Pad Memory is made up of 16 registers each 16 bits wide,
giving the ability to compute an effective address of 64K. These
registers may be assigned label names like "Pointer" by the use of
pseudo-operators, to make programs more readable, or may be
directly addressed by number.


The S-Pad Arithmetic Logical Unit forms the operand addresses and
also automatically loop counts, shifts the addresses right once
(divide by two), shifts the addresses left once (multiply by two)
or twice (multiply by four). There is also the ability, if
required, of bit reversal, to swap bits while accessing data in a
scrambled order after a Fast Fourier Transform. The results of the
S-Pad arithmetic logic unit, called SPFN, set bits in the status
register to indicate whether the results were less than zero (N),
zero (Z) or if there was a carry bit (C). These bits are available
for testing by program instructions during the next instruction
cycle.
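Bit reversal is the index mapping needed to read FFT results that
were left in scrambled order. The Python sketch below shows the
mapping in software; it is a general illustration of the technique,
not a description of how the S-Pad hardware implements it, and the
function name is hypothetical.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    reversed_index = 0
    for _ in range(bits):
        # Move the lowest remaining bit of index onto the result.
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


# Access order for an 8-point transform (3 address bits).
print([bit_reverse(i, 3) for i in range(8)])
```

For eight points the order is 0, 4, 2, 6, 1, 5, 3, 7, and applying
the mapping twice returns the original index.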


TMA, DPA and MA store the computed address from the S-Pad ALU. The
contents of each can either be changed by the value of SPFN or
incremented by one. One cycle is required to compute the address
and load it into the proper register [32].
4. Table Memory


Table memory is a 512 word, 38-bits per word bipolar read only
memory used to store important and much used constants. This memory
has a 167-nanosecond cycle time but requires two cycles to get the
value from memory to the output register TM [7]. Values in TM are
available for use by DPX, DPY, MD, M1 and A1. These values may be
requested every machine cycle and are accessed by changing the
contents of the Table Memory Address Register (TMA) in the S-Pad.
The programmer must control the timing necessary to insure the
correct constant is at TM when needed due to the two cycle access
time requirement.
In Fast Fourier Transform Mode, the address in TMA is interpreted
by the hardware to be the angle which corresponds to the
appropriate root of unity for a particular step in the FFT
algorithm. Therefore, in a single quadrant of cosines, a full table
can be represented [32].
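The reason one quadrant of cosines is enough follows from the
symmetry of the cosine and from the identity sin(x) = cos(x - 90
degrees). The Python sketch below is a software analogy under
stated assumptions: the transform length N, the table layout and
the function names are hypothetical and do not describe the actual
table memory contents.

```python
import math

N = 16  # hypothetical transform length, a multiple of four
QUARTER = N // 4

# First quadrant only: cos(2*pi*k/N) for k = 0 .. N/4.
COS_TABLE = [math.cos(2 * math.pi * k / N) for k in range(QUARTER + 1)]


def cos_from_table(k: int) -> float:
    """cos(2*pi*k/N) recovered from the first-quadrant table."""
    k %= N
    if k <= QUARTER:
        return COS_TABLE[k]
    if k <= 2 * QUARTER:
        return -COS_TABLE[2 * QUARTER - k]
    if k <= 3 * QUARTER:
        return -COS_TABLE[k - 2 * QUARTER]
    return COS_TABLE[N - k]


def sin_from_table(k: int) -> float:
    """sin(2*pi*k/N), using sin(x) = cos(x - pi/2)."""
    return cos_from_table(k - QUARTER)


def root_of_unity(k: int) -> complex:
    """exp(-2j*pi*k/N), the factor used at one step of an FFT."""
    return complex(cos_from_table(k), -sin_from_table(k))


print(root_of_unity(2))
```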


There is an optional Random Access Table Memory (TMRAM) containing
1K of random access memory [8]. This allows loading of special
constants necessary for special applications without the overhead
of computing them every time or using valuable data pad space to
store them. The cost of this option is approximately $1850.00 [7].
5. Data Pad X and Y


The Data Pads (fig 9) consist of sixty four 38-bit accumulators,
four of which are available from the 16 addressable each
instruction cycle. These 64 accumulators are divided into two
32-register blocks called Data Pad X (DPX) and Data Pad Y (DPY).
From each Data Pad, one register can be read and another written
during the same cycle.


The restrictions are that the same register cannot
be read and written simultaneously and that a read and write
operation during the same cycle must occur on registers
whose addresses differ by no more than 7 due to
base-address-plus-offset addressing. (However a register in
DPX may be written at the same time as a register in DPY even
if they both have the same address.) In the S-Pad, the Data
Pad Address Register (DPA) supplies the base address to be
used by the read/write instruction to locate the proper Data
Pad register. It supplies both DPX and DPY concurrently.
The instruction uses this base address and an offset in the
form DPX(offset) or DPY(offset) and can address -4 to +3
offset from the base in each Data Pad to find the effective
address. Therefore if the DPA contains decimal value 20,
registers 16, 17, 18, 19, 20, 21, 22 and 23 can be addressed
in each data pad. The register addresses of both Data Pads
range from 0 to 37 (base 8) and are arranged in a circular
addressing scheme. Therefore 37 (base 8) + 1 = 0 and the
programmer need not be concerned about writing into a
non-existent location but must only be concerned with
overwriting previously written information.
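
The base-plus-offset scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function and constant names are hypothetical, and the only facts taken from the text are the 32 registers per Data Pad, the offset range of -4 to +3 and the circular wrap-around.

```python
DATA_PAD_SIZE = 0o40        # 32 registers per Data Pad, numbered 0 to 37 (base 8)
MIN_OFFSET, MAX_OFFSET = -4, 3


def effective_address(dpa: int, offset: int) -> int:
    """Return the Data Pad register selected by base address DPA plus offset."""
    if not MIN_OFFSET <= offset <= MAX_OFFSET:
        raise ValueError("offset must lie between -4 and +3")
    # Circular addressing: 37 (base 8) + 1 wraps around to 0.
    return (dpa + offset) % DATA_PAD_SIZE


def addressable_registers(dpa: int) -> list[int]:
    """List the eight registers reachable in each Data Pad for a given DPA."""
    return [effective_address(dpa, off)
            for off in range(MIN_OFFSET, MAX_OFFSET + 1)]


print(addressable_registers(20))     # [16, 17, 18, 19, 20, 21, 22, 23]
print(effective_address(0o37, 1))    # 0
```

With DPA = 20 the sketch reproduces the register list quoted in the text, and the second call shows the wrap from register 37 (base 8) back to 0.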


DPX and DPY receive information from MD, FA, FM,
DPY, output of the S-Pad arithmetic logical unit (SPFN)
and VALUE (an immediate value used by immediate instructions
coming from the command buffer). DPX and DPY supply
information to M1, M2, A1, A2, DPX, DPY and MI [32].
0. Main Data Memory 


Main Data Memory (fig 10) contains 64K 38-bit words
used primarily to store inputted data which will be operated
on by the program. This memory is available in two forms,
167-nanosecond hardware interleaved MOS with 4K word
segments or 333-nanosecond hardware interleaved MOS with 4K
word segments. Both memories have a two bit parity option
available [7] and a one megaword page selection option [9].
With memory limited to 64K, the largest complex-to-complex
Fast Fourier Transform possible is 32K, which may not be
acceptable in some applications.


Main Data Memory receives input information into its
Memory Input Buffer (MI) from FA, FM, MD, DPX, DPY, TM, SPFN
and VALUE. Output is via the Memory Data Buffer (MD) to DPX,
DPY, A2 and M2.


Memory read or write may be requested every other
cycle by changing the value of the Memory Address Register
in the S-Pad. This yields an effective memory cycle
time of either 333-nanoseconds (167-nanoseconds plus one
machine cycle) or 500-nanoseconds (333 plus one machine
cycle) dependent on the type of memory installed [32]. By
special programming techniques and proper chip procurement,
this overhead can be reduced to the advertised memory speed
with the restrictions that the memory alternate between
chips or alternate between even and odd boundaries. If
effective speed is essential, it becomes the programmer's
responsibility to insure data location is known to the
program at all times [14]. A read requires three cycles for
information to be present in the MD if using 333-nanosecond
memory and two cycles if using 167-nanosecond memory. This
information will be available until a new value overwrites
it. If a write or read is initiated before two memory
cycles (unless special chips and techniques of above are
used), the request will not be lost but the memory will
automatically provide a hardware lockout (wait until memory
is available for read/write) [14].
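
The effective cycle times quoted above follow from adding one machine cycle to the memory cycle. The short Python sketch below works this out; it assumes the machine cycle is exactly one sixth of a microsecond (which the text rounds to 167 nanoseconds) and that the two memory types take one and two machine cycles respectively. Both assumptions are for illustration only.

```python
from fractions import Fraction

# Assumption: one machine cycle is exactly 1/6 microsecond (about 167 ns).
MACHINE_CYCLE_NS = Fraction(1000, 6)


def effective_memory_cycle_ns(memory_cycles: int) -> int:
    """Memory cycle time plus one machine cycle, rounded to whole nanoseconds."""
    return round((memory_cycles + 1) * MACHINE_CYCLE_NS)


print(effective_memory_cycle_ns(1))   # 167-ns memory: 333
print(effective_memory_cycle_ns(2))   # 333-ns memory: 500
```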


The value in the Memory Address Register (MA) points
to the desired location in main data memory. MA may be
either set to a specific value or incremented/decremented by
one in the S-Pad. Since there is a slight time lag between
when a value is requested to be placed in MD and when it
actually gets there, the programmer must always be aware of
what values are in MI and MD, to allow the proper set up
time to get these values to either the Adder, Multiplier or
the DPX, DPY or MI address [32].

Program Source Module


The Program Source Module (fig 11) consists of the
Program Source Memory (PS), Program Source Address Register
(PSA), Control Buffer (CB) and the Subroutine Return Stack
(SRS) [32].

The PS is high speed, 50-nanosecond, bipolar
memory addressable to 4K 64-bit words and is available in
256 word increments [4]. The PSA contains the address of
the next instruction and is incremented by one after
instruction execution unless modified by either the Control
Buffer (new address as a result of a branch or jump
instruction) or the Subroutine Return Stack. The SRS saves
the current PSA when a Jump Subroutine instruction is
performed and increments the value of the Subroutine Return
Address (SRA). When a Return instruction is performed, the
SRA is decremented by one making nested subroutines
possible. The Control Buffer decodes and executes the
instruction as the CPU would in a general purpose computer


“ 


EI. 
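
The interaction of the PSA, SRS and SRA on nested subroutine calls can be illustrated with a small Python model. This is a sketch only: the stack depth of 16 and the choice to save the address of the instruction following the Jump Subroutine are assumptions made for the example and are not taken from the AP-120B documentation.

```python
class ProgramSource:
    """Toy model of the PSA and the Subroutine Return Stack."""

    def __init__(self, stack_depth: int = 16):   # depth is an assumption
        self.psa = 0                     # address of the next instruction
        self.srs = [0] * stack_depth     # Subroutine Return Stack
        self.sra = 0                     # Subroutine Return Address (stack index)

    def step(self) -> None:
        """Normal sequencing: PSA is incremented by one."""
        self.psa += 1

    def jump_subroutine(self, target: int) -> None:
        if self.sra >= len(self.srs):
            raise OverflowError("subroutine return stack is full")
        self.srs[self.sra] = self.psa + 1   # assumed: save the return point
        self.sra += 1
        self.psa = target

    def return_from_subroutine(self) -> None:
        if self.sra == 0:
            raise IndexError("return without a matching jump subroutine")
        self.sra -= 1
        self.psa = self.srs[self.sra]


ps = ProgramSource()
ps.psa = 10
ps.jump_subroutine(100)      # outer call
ps.step()
ps.jump_subroutine(200)      # nested call made from address 101
ps.return_from_subroutine()
print(ps.psa)                # 102
ps.return_from_subroutine()
print(ps.psa)                # 11
```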


Interface with PDP-11 Series

The interface unit with the PDP-11 series contains two
major segments, the Front Panel and the DMA Controller
and Formatter. The Front Panel contains three registers and
is used mainly as a debugging aid while the DMA Controller
and Formatter contains five registers and is used for
program and data entry or removal.

a. Front Panel


The Front Panel (fig 12) consists of three
16-bit registers, the Switch Register (SWR), the Lights
Register (LITES) and the Function Register (FN). The Front
Panel is used for bootstrapping and debugging of user
programs. These three registers can be examined by the host
and take the place of the toggle switches normally on the
front panel of the console. With the use of the
Debugger program, these registers can effectively breakpoint
the AP-120B at a selected program location or data address.
The Front Panel allows each program to be single-stepped
through its execution sequence [6,7].


The Switch Register is set by the host
computer but can be read by both the AP-120B or the host.
The SWR is used to enter data and addresses into the
AP-120B, primarily for debugging. Its contents can be fed
into DPX, DPY, MD or the S-Pad.


The Lights Register simulates the front panel
lights of the console. This register is set by the AP-120B
and can only be read by the host. LITES is used to display
selected contents of the internal registers of the AP-120B.


The final register is the FUNCTION register
which provides front panel toggle-like controls to the
AP-120B. The FN can stop, start, step or reset the AP-120B.
It can also continue operation resuming at the current value
of the PSA, examine a register, examine a portion of a
register or memory contents of a selected area, deposit the
contents of SWR into a selected register or memory location
and then breakpoint on the values of TMA, MA or
PSA. It can also increment the TMA, MA or DPA after
execution of an instruction to facilitate stepping through
memory locations [32].


The Front Panel is advertised to be invaluable
in troubleshooting when used in conjunction with the
Interactive Debugger routine.

b. DMA Control


The DMA Control is the second half of the
interface and consists of three 16-bit registers, one 18-bit
register and one 38-bit register. DMA Control is
responsible for transferring programs and data between the
AP-120B and the host computer. This section of the Front
Panel will also do format conversion "on the fly" which
should effectively alleviate time lags [32]. Four types of
data transfer combinations are possible, host DMA to AP-120B
DMA, host DMA to AP-120B Programmed I/O, host Programmed I/O
to AP-120B Programmed I/O and host Programmed I/O to AP-120B
DMA with a maximum theoretical burst transfer rate of three
megawords per second for all types of transfers [7].


The Format Register (FMT) is a 38-bit double-
buffered register used to perform rapid transfers of
floating-point numbers from the host to the AP-120B.
The FMT will convert 16-bit integer numbers to 38-bit
normalized floating-point numbers, 32-bit PDP-11 integers
to 32-bit AP-120B integers and 32-bit floating-point numbers
to 38-bit floating-point numbers. All these operations are
the inverse for the AP-120B to host direction [7]. Since the
PDP-11 is a 16-bit computer, it will access the Formatter in
16-bit half-words to be compatible. It must be noted that
for some applications, such as difference filtering, there
is a possibility of extreme accuracy loss due to 16-bit
integer to 38-bit floating-point conversion. The synthetic
precision generated by such a conversion can cause certain
coefficient combinations, such as +1 and -1, when
multiplied by mirrored arrays, to result in errors when
reconverted to 16-bit format. The programmer must be aware
of these possible losses and test for them before faith is
placed in the result.


The AP Direct Memory Address Register (APDMA)
points to consecutive locations in AP-120B Main Data Memory
during DMA transfers. This register can be automatically
incremented/decremented allowing blocks of information to be
transferred into consecutive locations with minimal overhead.


The Host Memory Access Register (HMA) operates
similar to the APDMA except it points to consecutive memory
locations in the host memory. In the PDP-11 this memory is
128K so the HMA is an 18-bit register for this addressing
capability.


The Word Count Register (WC) counts the number
of words transferred during a DMA operation. This register
must be preset to the required number of words and will stop
DMA transfer when the prescribed number of words is
transferred.


The final and most important register in the
interface is the Control Register (CTL). It controls the
direction and mode of transfer, type of format conversion
and sets certain status bits pertaining to the transfer.
This register, with the use of HMA and/or APDMA, allows the
host to execute other programs and be interrupted when the
DMA is completed. The CTL also allows either the host or
AP-120B to control the data transfers. (The AP-120B must
control transfer from a loaded program since the executive
is not powerful enough to control data transfer [32].)
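
The cooperation of the HMA, APDMA and WC registers during a block transfer can be sketched as follows. This Python model is illustrative only: it assumes the word count is decremented toward zero, that both address registers step by one word per transfer, and it ignores format conversion and the CTL status bits entirely.

```python
def dma_transfer(host_mem, ap_mem, hma, apdma, wc, to_ap=True, step=1):
    """Move wc words between host memory and AP Main Data Memory.

    hma and apdma are the starting addresses; both are advanced by
    `step` after every word (step=-1 models the decrement mode).
    Returns the final (hma, apdma, wc) register values.
    """
    while wc > 0:                       # transfer stops when the count is exhausted
        if to_ap:
            ap_mem[apdma] = host_mem[hma]
        else:
            host_mem[hma] = ap_mem[apdma]
        hma += step
        apdma += step
        wc -= 1
    return hma, apdma, wc


host = list(range(8))        # stand-in for host memory
ap = [0] * 8                 # stand-in for AP-120B Main Data Memory
print(dma_transfer(host, ap, hma=2, apdma=4, wc=3))   # (5, 7, 0)
print(ap)                                             # [0, 0, 0, 0, 2, 3, 4, 0]
```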


SOFTWARE

Various software support, executive and development
programs are available with the AP-120B.

1. Executive and Associated Routines





The AP-120B provides executive and housekeeping
routines to increase the effectiveness of operation and
enhance program development.


a. APMATH 


APMATH is a series of approximately 150 [8]
library functions, vector and matrix subroutines and signal
processing algorithms [7] written in AP assembly
language. These routines are callable from either host
Fortran, host Assembly or AP assembly languages [56] with
the use of the AP Executive. These programs can reduce the
run time and decrease programming time by presenting some of
the most common array processing functions in subroutine
callable form. These routines include: data transfer and
control, basic vector arithmetic, matrix operations and Fast
Fourier Transform; all of which are able to work with both
real and complex data.

b. APEX


APEX is the AP Executive routine which is
resident in the host computer and allows the AP-120B to
communicate with the host computer via Fortran or host
Assembly language calls. APEX decodes subroutine calls from
the host computer [56] and directs the AP-120B to perform
the specified action. Both APMATH routines and user written
routines may be called by the AP-120B from the host computer


E]. 
c. APAL


The AP Cross Assembler (APAL) is a two pass
assembler written in Fortran IV which requires 4K memory in
the host computer to operate. APAL assembles source text
written in AP Assembly language into 64-bit code
understandable by the AP-120B. The assembler also
optionally produces an AP Assembly listing containing errors
in both passes, location counter, assembled data, the
symbol table and source statements.


APAL allows signed constants ranging from
-32768 to 32767 and unsigned constants from 0 to 65535 both
of which may be represented in binary, octal (default base),
decimal or hexadecimal. APAL allows free formatting but
recognizes the general source statement form: optional
label followed by a colon, multiple op codes separated by
semicolons (one to ten operations which total no more than
64-bits. Sixty four-bits is the maximum dictated by seven
data transfers, one add, one multiply and one address
increment/decrement), and an optional comment statement
denoted with leading double quote (").
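
The general source statement form can be illustrated with a small Python parser. It is a sketch of the form as described in the text (optional label and colon, one to ten op codes separated by semicolons, optional comment introduced by a double quote) and is not a reimplementation of APAL. The mnemonics in the sample line are used only as placeholders and have not been checked against the APAL manual.

```python
MAX_OPERATIONS = 10


def parse_statement(line: str):
    """Split a source line into (label, operations, comment)."""
    code, _, comment = line.partition('"')     # comment starts at the double quote
    label = None
    if ":" in code:
        label, _, code = code.partition(":")   # optional label ends with a colon
        label = label.strip()
    operations = [op.strip() for op in code.split(";") if op.strip()]
    if not 1 <= len(operations) <= MAX_OPERATIONS:
        raise ValueError("a statement holds one to ten operations")
    return label, operations, comment.strip() or None


# Placeholder mnemonics; the sample line is illustrative only.
print(parse_statement('LOOP: FADD DPX,DPY; FMUL TM,MD; INCMA "inner loop'))
```

The parser returns the label (or None), the list of operations and the comment text (or None), which is enough to check that a line respects the one-to-ten operation limit before any real assembly is attempted.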


Once the modules are written, APAL can be
operated dynamically, allowing the programmer to build the
program at assembly time. APAL will question the operator
about the source file name, destination file name etc. and
subsequently will prompt him concerning missing items. If
there are errors in the module, these can be changed


dynamically without reassembling the entire module [2841]. 


d. APLINK


The AP Linker (APLINK) is written in Fortran IV
and requires approximately 10K of memory in the host
computer. APLINK performs functions similar to those of
any other link editor which include relocation and assigning
absolute addresses to the object module, correlation of
global entry symbols in one module with external symbols in
the other modules, loading the module from the program
library and production of the final load module. These
functions are performed interactively with dialogue between
APLINK and the user at the console.

Besides linking the modules, APLINK returns to
the console any symbols in a file which are undefined, will
output the symbol table and locations when requested and
returns the high address and starting address to be used
with the Debugger routine [5].


e. APSIM 


APSIM is the AP-120B simulator and is designed
to be used when developing programs when use of the AP-120B
is impractical or impossible due to production schedules.
APSIM emulates all hardware and timing characteristics of
the AP-120B as well as performing the mathematical routines
as closely as possible to the way the AP-120B would perform
them [32]. APSIM requires 32K words of memory in the host
computer [1].

f. APDEBUG


APDEBUG is the AP-120B interactive debugger
program which permits dynamic debugging of AP-120B
applications programs at run time. Changes can be made when
the problem is identified and APDEBUG will call the APLINK
and APAL routines to insert the new object module then
continue with program development. APDEBUG can work in
conjunction with the simulator or the actual hardware [5].

2. Testing Software

There are three software modules available to
completely test the AP-120B hardware operations.


APTEST is the AP-120B path tester. This
software exercises the panel, DMA interface, internal
registers and memory to check for proper operation.

APPATH tests the internal data paths of the
AP-120B and returns diagnostics upon finding any errors.


Forward/Inverse Fast Fourier Transform Test
(FIFFT) verifies correct operation of the AP-120B's
arithmetic units by performing Fast Fourier Transforms and
inverses then comparing results with standard answers [22].
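
The idea behind a forward/inverse transform test can be sketched in Python. The code below is not the vendor's test: it uses a textbook recursive radix-2 FFT, requires a power-of-two length, and compares the round-trip result with the original input rather than with a table of standard answers.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def round_trip_ok(x, tol=1e-9):
    """Forward transform, inverse transform, scale by 1/N, compare with input."""
    n = len(x)
    back = [v / n for v in fft(fft(x), inverse=True)]
    return all(abs(a - b) <= tol for a, b in zip(x, back))


print(round_trip_ok([complex(i, -i) for i in range(8)]))   # True
```

A failure of the round trip on hardware would point at the adder, multiplier or table memory, since all three take part in every butterfly.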


These packages can be used to help insure proper
operation of the AP-120B before development or actual
operation and also help with the hardware fault location
effort during system maintenance.

3. Programming Language


The Math Library of AP functions can be called by
the host Assembly Language, Fortran or the AP Assembly
Language [56]. However to write a custom library function,
AP Assembly Language must be used and the cross-assembler
will translate it into an executable routine.


Investigating the programming language is not
important here except to say that it is similar in
characteristics to other assembly languages. There are
sufficient commands available to write a program to properly
control AP-120B execution in an efficient manner. Bit
testing, conditional branching, flag setting and arithmetic
instructions all are part of the instruction repertoire
allowing varied applications programs to be written.

Page Select Option


The AP-120B can alternatively be equipped with a
Page Select Option. This provides the ability to address
one megaword of main memory in the AP-120B by using host
main memory and virtual memory techniques. Each page can be
up to 64K words long (full Main Data Memory size, but each
page must be the same size). 16 pages are available. The
Page Select Option increases the ability for the AP-120B to
work on larger transforms, but due to paging overhead, it
may not increase the throughput rate due to increased host
involvement.
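
The relationship between one megaword, 16 pages and 64K-word pages can be checked with a short Python sketch. It assumes full-size 64K pages and a simple split of a one-megaword address into a page number and an offset within the page; the actual hardware layout is not described in the text, so the field positions here are an assumption.

```python
PAGE_COUNT = 16
PAGE_WORDS = 64 * 1024                 # full Main Data Memory size
TOTAL_WORDS = PAGE_COUNT * PAGE_WORDS  # 1,048,576 words (one megaword)


def split_address(address: int) -> tuple[int, int]:
    """Return (page, offset) for an address in the one-megaword space."""
    if not 0 <= address < TOTAL_WORDS:
        raise ValueError("address outside the one-megaword space")
    return divmod(address, PAGE_WORDS)


def join_address(page: int, offset: int) -> int:
    """Inverse of split_address."""
    return page * PAGE_WORDS + offset


print(TOTAL_WORDS == 2 ** 20)            # True
print(split_address(65536))              # (1, 0)
print(split_address(TOTAL_WORDS - 1))    # (15, 65535)
```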


This option modifies the AP Direct Memory Address
Register (APDMA) located in the DMA Control section of the
interface by extending it from 16 to 20 bits therefore 2**20
addressing capability (approximately one megaword). This
virtual memory ability is called the AP Memory Address
Extension (APMAE) and new addresses can only be loaded by
the host. Since the host will control all paging
operations, the AP-120B commands will not change inasmuch as
it will only recognize 64K word locations [9].

Programmable I/O Processor


The Programmable I/O Processor (PIOP) is a micro-
codable micro-processor which acts like a high speed channel
in controlling an input/output port. It is capable of
transferring data at a six megahertz burst rate or at a
three megahertz sustained operation rate (assuming 167
nanosecond Main Data Memory). The PIOP can be used with up
to eight external devices (like A/D converters or mass
storage devices) thereby acting as an I/O bus controller.


The PIOP interfaces directly with the DMA Controller
of the interface unit. It has a 50-bit instruction word, a
20-bit arithmetic logical unit and is capable of addressing
to one megaword of memory making it compatible with the Page
Select Option. Communication with the AP-120B is
accomplished via one of eight flags and four interrupts.
The micro code supports subroutines and has the logic to
perform jumps within its own code.


The PIOP must handle all handshaking and timing
considerations with both the external devices and the host
program to insure data integrity. This can be complicated at
times so a Programmable I/O Channel (PIOC) is also available
which decreases flexibility but eases the programming burden


=>). 


Neither the PIOP nor PIOC provides a method of
connecting two AP-120B's together in series without host
intervention which tends to limit some of the possible
applications of the AP-120B.

PROGRAMMING, OPERATION AND EXECUTION


The AP-120B can utilize the parallel operation
capability of the adder, multiplier and data transfers to
speed the execution of the program and throughput on large
data arrays. These parallel operations must be controlled
so that optimum execution speed can be realized without
memory interlock or lockout. Lockout could eventually lead
to program stoppage [1]. Since most scientific data can
be structured into an array form, the array processor can
work on it quickly and efficiently in its natural
form where a general purpose computer must, in most cases,
restructure it [5651].


Before the AP-120B can work on data, the data must first
be transferred from its memory locations in the host to Main
Data Memory in the array processor (or moved to Main Data
Memory from an external device via the PIOP. That situation
will not be dealt with here since the PIOP is programmable
and therefore path and data options associated with it are
many). The data is transferred via the interface with the
use of the APPUT(HOST,AP,N,TYPE) command (Put Data into the
AP-120B). As with arguments of other AP-120B CALL
statements, HOST, AP, N and TYPE need not be explicitly
stated but can be expressions, integers or variables.

The host and AP-120B must be synchronized in their
operations so computations can not go on while data is still
being transferred to memory. APWD (Wait on Data) causes the
host to wait until data transfer is completed before it
resumes executing the program. APWR (Wait on Running)
causes the host to wait until the AP-120B is completed with
one command before another is sent over. APWAIT is a
combination of APWD and APWR. One difficulty encountered
using these commands is that the host must monitor the
progress of the execution if polling is used to determine
APWD, APWR or APWAIT completion or the AP-120B must wait if
priority interrupts are used, which increases the time
necessary to complete the program.


Some of the overhead of the host can be eliminated by
not using the AP Wait on Running (APWR), AP Wait on Data
(APWD) or AP Wait (APWAIT) commands. This technique may
speed up program execution and should only be used when it
is absolutely necessary and when there is no chance that the
results will be processed before they are actually present
in the AP-120B Main Data Memory. Floating Point Systems
suggests that the program first be written and executed with
the APWR, APWD and APWAIT commands present and the results
noted. Then removing a few of those instructions at a
time, the results can be checked to see if they match the
original results. This only works for specific applications
and does not conform to modern programming practices. It is
also extremely dangerous since it does not allow for speed
fluctuations due to temperature variations.


When processing is complete, the data can be transferred
back to the host via the APGET() command which operates in
the same manner as the APPUT.
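
The ordering of the transfer and wait calls described above can be illustrated with a toy Python model. This is not the real host library: the class MockAP, the routine scale and all argument handling are hypothetical, and the model only shows why a routine must not start before the data transfer has been waited on.

```python
class MockAP:
    """Toy stand-in for the host-side calls; not the real APEX interface."""

    def __init__(self):
        self.memory = {}                # Main Data Memory: address -> value
        self.transfer_pending = False   # cleared by apwd()
        self.running = False            # cleared by apwr()

    def apput(self, host, ap, n, type_code=0):
        """Put n words of host data into AP memory starting at address ap."""
        for i in range(n):
            self.memory[ap + i] = host[i]
        self.transfer_pending = True

    def apwd(self):                     # Wait on Data
        self.transfer_pending = False

    def apwr(self):                     # Wait on Running
        self.running = False

    def apwait(self):                   # combination of APWD and APWR
        self.apwd()
        self.apwr()

    def scale(self, ap, n, factor):
        """Hypothetical library routine: multiply n words by factor."""
        if self.transfer_pending or self.running:
            raise RuntimeError("host and array processor are not synchronized")
        for i in range(n):
            self.memory[ap + i] *= factor
        self.running = True

    def apget(self, host, ap, n, type_code=0):
        """Get n words from AP memory back into the host list."""
        if self.running:
            raise RuntimeError("results requested before the routine finished")
        for i in range(n):
            host[i] = self.memory[ap + i]


ap = MockAP()
ap.apput([1, 2, 3], 0, 3)
ap.apwd()
ap.scale(0, 3, 2)
ap.apwr()
result = [0, 0, 0]
ap.apget(result, 0, 3)
print(result)    # [2, 4, 6]
```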


The application program resides in the host memory and
the host executes this program. The host will determine
which routines must be passed to the AP-120B and if the data
necessary is present in the array processor. When a routine
is called, the host will jump to it and execute it but if
the routine called is part of the math library (whether from
APMATH or a user written math routine), the host first jumps
to APEX. APEX loads the 64-bit instructions into the
AP-120B Program Source Memory, calculates the remaining
space available in the Program Source Memory, updates the PS
location table, loads the parameters and initiates the
execution. If the same routine is called again immediately,
it will not be reloaded since it is already present but only
the new parameters will be loaded. If a different routine
is called, APEX will first check the PS location table to
see if there is enough unused space available to load it
without destroying any routines currently residing in
Program Storage. If not enough space is available, the
last written program will be overwritten with the newly
called routine (Last In First Out (LIFO)).
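
The bookkeeping described above can be modelled with a short Python sketch of a PS location table. The class name, routine names and sizes are hypothetical, and the model assumes that any routine still resident is reused without reloading; only the Last In First Out overwrite rule is taken from the text.

```python
class PSLocationTable:
    """Toy model of routine residency in Program Source Memory."""

    def __init__(self, capacity: int):
        self.capacity = capacity    # words of Program Source Memory
        self.loaded = []            # (name, size) pairs in load order

    def free_words(self) -> int:
        return self.capacity - sum(size for _, size in self.loaded)

    def call(self, name: str, size: int) -> bool:
        """Make a routine resident; return True if it had to be loaded."""
        if any(resident == name for resident, _ in self.loaded):
            return False            # already present: only parameters are sent
        if size > self.capacity:
            raise ValueError("routine is larger than Program Source Memory")
        while self.free_words() < size:
            self.loaded.pop()       # overwrite the most recently written routine
        self.loaded.append((name, size))
        return True


table = PSLocationTable(capacity=512)
print(table.call("VADD", 200))   # True, loaded
print(table.call("VADD", 200))   # False, already resident
print(table.call("CFFT", 250))   # True, loaded
print(table.call("VMUL", 100))   # True, CFFT is overwritten to make room
print(table.loaded)              # [('VADD', 200), ('VMUL', 100)]
```

In this model the first routine loaded is the last one to be overwritten, which is why loading the most frequently used routines first reduces reloading.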


The overhead required for each math library routine
called is between 100 and 1000 microseconds. One hundred
microseconds is the minimum time required to check the table
and move parameters. This minimum time is required for
every call, even in looping operations. During this period,
the host must be available to the AP-120B which would cause
unnecessary host overhead. While the AP-120B is executing
any specific routine, the host can be freed to do other
tasks and treat the AP-120B as a peripheral device. The
host can either be interrupted or can use polling techniques
to determine if the array processor requires assistance. In
either case, the programmer must be aware of when a break
occurs so he can insure that the proper sequence of routines
is used to allow the host to perform other operations and
not be burdened by many AP-120B services.
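
Because the per-call overhead is paid on every call, its total grows linearly with the number of calls made in a loop. The Python sketch below simply multiplies out the 100 and 1000 microsecond bounds quoted in the text; the function name is illustrative.

```python
MIN_OVERHEAD_US = 100     # table check and parameter move
MAX_OVERHEAD_US = 1000    # maximum data and program transfer


def call_overhead_us(calls: int) -> tuple[int, int]:
    """Return (minimum, maximum) host overhead in microseconds for `calls` calls."""
    return calls * MIN_OVERHEAD_US, calls * MAX_OVERHEAD_US


print(call_overhead_us(1))       # (100, 1000)
print(call_overhead_us(1000))    # (100000, 1000000), i.e. 0.1 to 1 second
```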


Several ways to increase available free time in the host
are to transfer more than one vector with each APPUT or
APGET command, use optimum AP-120B library calls to perform
given operations (it is the programmer's responsibility to
determine which AP routines are best for each situation) and
overlap host and AP-120B operations whenever possible.


Since every call of a routine requires host intervention,
several routines can be combined into one by writing a
special macro to combine those routines, which will
effectively eliminate some host overhead by using only one
"call" statement. (But these macros must be small due to
limited AP-120B program memory.) Since host overhead varies
between 100 and 1000 microseconds, with the higher value
being due to the maximum amount of data and program
transfer, some overhead can be eliminated by loading the
most used routines first, since overwrite is accomplished by
LIFO. APEX must also be a part of the interrupt priority
scheme of the host (interrupt or polling); therefore, by
having the AP-120B at a high priority, the overall wait time
of the system due to interrupt wait can be minimized [8].

MAP-300


The MAP-300 (Macro Array Processor) (fig 13) is manufactured
by CSP Incorporated, Burlington, Massachusetts. The basic
structure consists of three independent busses, an executive
routine, two parallel arithmetic units, an addresser and an
input/output handler, each having its own clock and
operating in a parallel asynchronous fashion. The basic
functional units are the Central System Processor Unit (CSPU),
the Arithmetic Processor (AP) (consisting of the Arithmetic
Processing Unit (APU) and the Addresser Processor Section
(APS)), the Host Interface Scroll (HIS) and an optional
Input/Output Scroll (IOS). All except the CSPU use micro-
coded routines stored in their own small memories and
communicate with each other via flags set in registers.
(The CSPU stores its micro coded routines in main MAP
memory.) The Host Interface Module (HIM) section of the HIS,
the IOS and the CSPU are built around a standard Intel 3002
bit slice micro processor.


The representation of MAP-300 numbers is usually a
32-bit floating point format with a one-bit sign, a seven
bit exponent (giving a range of 16 ** -64 to 16 ** 63 biased
by 64 therefore 0 to 127 are the actual numbers stored) and
a 24 bit mantissa allowing a total range of 10 ** -77 to 10
** 76. Sixteen-bit floating-point and 16-bit fixed-point
numbers are also available. MAP-300 main memory is
accessible in either 32-bit full-words or 16-bit half-words
and eight-bit bytes can be accessed by packing pairs into a
16-bit half-word [18]. SNAP-II commands like VFIX8 assume
this packing exists [34]. The ability to address in half-
words and/or bytes is important as it may increase the
efficiency of the program and array processor, allowing
operations to be performed which may not have otherwise fit
in a word-only addressable memory.
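
The field widths and exponent range of this format can be checked with a few lines of Python. The sketch assumes only what the paragraph states: a 32-bit word, a one-bit sign, a seven-bit exponent biased by 64 and a base of 16. The conversion to powers of ten is rounded to the nearest integer.

```python
import math

WORD_BITS, SIGN_BITS, EXPONENT_BITS = 32, 1, 7
BIAS, BASE = 64, 16

mantissa_bits = WORD_BITS - SIGN_BITS - EXPONENT_BITS    # remaining bits
min_exponent = 0 - BIAS                                   # stored value 0
max_exponent = (2 ** EXPONENT_BITS - 1) - BIAS            # stored value 127


def decimal_exponent(power_of_16: int) -> int:
    """Nearest power of ten corresponding to 16 ** power_of_16."""
    return round(power_of_16 * math.log10(BASE))


print(mantissa_bits, min_exponent, max_exponent)          # 24 -64 63
print(decimal_exponent(min_exponent),
      decimal_exponent(max_exponent))                     # -77 76
```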


Although the MAP-300 is asynchronous, the advertised
average CSPU cycle time is approximately 70-nanoseconds with
about 500-nanoseconds required for a memory read/write
operation when using 500-nanosecond MOS memory
ШМЕЗ-папосесогсс using bipolar). Fulleword operands and 
instructions starting on an odd address boundary, however,
require about two 500-nanosecond memory cycles. A pseudo-
operation can be used to insure even-boundary locations
result [18].


The MAP-300 is capable of operating in temperatures
from 0 to 50 degrees centigrade at 10 to 90 percent
humidity. The power requirements are either 115 VAC or 230
VAC single phase plus or minus ten percent at 47 to 63
Hertz. The weight is approximately 80 pounds.


The MAP relies heavily on internal parallel processing
to increase throughput and limit wait time. The MAP-300
stores the executive and array routines in its own memory
(as opposed to storing it in the host memory). With the use
of Function lists and statements like "MPWHL" (MAP version
of the "DO WHILE"), the MAP can operate independently of the
host after initial loading of the program. With the
three bus structure, the MAP theoretically can
simultaneously input into one memory, output from the second
while doing computations on the third and never utilize the
host except for initialization.


The MAP has a separate instruction set for the Central
System Processor Unit (CSPU), Arithmetic Processor Unit
(APU), Addresser Processor Section (APS), and Host Interface
Scroll (HIS). Inasmuch as these processors work
independently, the instruction sets are not as complicated
as may have been necessary if operation was controlled
totally from a central site. The total number of
instructions per second attainable by the MAP-300 is data
dependent. Whenever all steps necessary to perform the
operation are completed, as witnessed by properly setting
the correct flags in pseudo-memory (to be discussed later),
the operation will continue to completion. While the
addition/multiplication operation is being carried out in the
APU, preparation for the next word (half-word) of
information can be conducted in the unaffected processors.
System flags are used to communicate between the processors.
These flags include General Purpose flags available to the
programmer for general system communication, Control flags
to control processor modes and operation sequencing, Status
flags to indicate processor status and Hardware
Configuration flags [13].


The MAP-300 system installed for evaluation consisted
of: the MAP-300 processor, interface with the PDP-11
computer utilizing the RSX-11M operating system, 24K words
of 500-nanosecond MOS master memory (8K for each memory),
power panel, expansion chassis, installation, I/O driver,
SNAP-II algorithm library, cross assembler, simulator and


EET The orice of the system was 544,500 (271. 


CHARACTERISTICS AND HARDWARE


1. CSPU





The Central System Processor Unit (CSPU) (fig 14) is the
"Command Central" of the MAP-300 array processor. The CSPU
responds to commands from the host, transfers data to and
from the host, assists the APS in address calculations and
loads the program memories of the Arithmetic Processor and
Host Interface Module. The CSPU performs the functions of a
front-end micro computer to control the actions of the
system.


The CSPU has a fast fixed-point arithmetic unit for
address calculations, an instruction register, an eight
register accumulator file and a priority interrupt network.
It has access to the three main memories via the memory
busses and supplies the other MAP processors with the
program instructions they need from main memory. Reentrant
subroutines and multi-level indirect addressing are
recognized by the CSPU. The CSPU has no I/O capability but
it instructs the Host Interface Scroll (or I/O Scroll)
to perform input or output operations to or from the host
(or external devices). The CSPU will never halt but will
always be in the WAIT state after its instruction sequence
is completed.


An important register in the CSPU is the Control
Status Register or C-State word (CSW). It is a [illegible]
register containing the status of prior operations, the
program counter as well as the source and destination
locations for block memory transfers. Fields of the
register can be combined to give hardware condition codes
for use in conditional operations, branches, [illegible] or
executes. The CSW also stipulates on which bus instructions
or data are present and controls the interrupt responses for
[illegible] units.


The CSPU is the only processor able to be
interrupted in the MAP (other processors can either Halt or
Wait) and contains a 63-level interrupt priority system with
one interrupt device per level and three lines per device
(seven possible combinations). The CSPU may only be
interrupted between instructions. It will also nest and
queue lower priority interrupts if a higher priority
interrupt is received during the servicing of a lower
priority interrupt. These interrupts are detected by
polling and levels are polled only if they are above the
current interrupt level. Lower level interrupts will
continue to exist but will not be recognized until the
higher priority interrupts are serviced.
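
The nesting and queueing behavior described above can be
sketched in a few lines of Python. This is only an
illustrative model, not CSPI's logic: the class name, the
convention that a larger number means a higher priority, and
the "poll between instructions" method are all assumptions
made for the example.

```python
class InterruptController:
    """Toy model of nested, prioritized interrupts.

    Assumption: a larger level number means a higher priority.
    """

    def __init__(self):
        self.pending = set()   # levels requested but not yet serviced
        self.active = []       # stack of levels currently in service
        self.log = []          # ("enter" | "exit", level) events

    def request(self, level):
        """A device raises an interrupt on the given level."""
        self.pending.add(level)

    def poll(self):
        """Called between instructions: recognize the highest pending
        level only if it is above the level now being serviced."""
        current = self.active[-1] if self.active else -1
        above = [lvl for lvl in self.pending if lvl > current]
        if above:
            level = max(above)
            self.pending.discard(level)
            self.active.append(level)      # nest over the interrupted level
            self.log.append(("enter", level))

    def finish(self):
        """The service routine for the current level completes."""
        level = self.active.pop()
        self.log.append(("exit", level))


def demo():
    ic = InterruptController()
    ic.request(2)
    ic.poll()        # level 2 begins service
    ic.request(1)
    ic.request(5)
    ic.poll()        # level 5 preempts level 2; level 1 stays queued
    ic.finish()      # level 5 done
    ic.poll()        # level 1 is below level 2, so it is not recognized
    ic.finish()      # level 2 done
    ic.poll()        # level 1 is finally serviced
    ic.finish()
    return ic.log
```

In this sketch the lower-level request is never lost; it
simply remains pending until every higher level has been
serviced, which is the behavior the text attributes to the
CSPU.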


The CSPU contains no memory but uses main memory to
store its instructions. When fetched, these instructions
are stored in the instruction register until execution. The
CSPU may also address a pseudo-memory location called System
Flag Register (SYSFLG) which is the primary inter-processor
communication system. By testing the bits of SYSFLG, the
CSPU can sense the status of any of the other processors.
(Pseudo-memory refers to memory physically located within
the sub-processors but which appears on the bus as a memory
address, similar to the PDP-11 family.) [18].
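
As a rough illustration of testing SYSFLG bits, the sketch
below treats the register as an integer and uses bit masks.
The bit positions and flag names are hypothetical; the actual
SYSFLG layout is not given in this text.

```python
# Hypothetical bit assignments, for illustration only.
APU_WAITING = 1 << 0
APS_WAITING = 1 << 1
HIS_BUSY = 1 << 2


def is_set(sysflg, mask):
    """Return True if every bit in mask is set in sysflg."""
    return (sysflg & mask) == mask


def set_flag(sysflg, mask):
    """Return sysflg with the bits in mask turned on."""
    return sysflg | mask


def clear_flag(sysflg, mask):
    """Return sysflg with the bits in mask turned off."""
    return sysflg & ~mask


def demo():
    sysflg = 0
    sysflg = set_flag(sysflg, APS_WAITING)
    sysflg = set_flag(sysflg, HIS_BUSY)
    sysflg = clear_flag(sysflg, HIS_BUSY)
    return sysflg
```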
2. Arithmetic Processor


The Arithmetic Processor consists of two components,
the Arithmetic Processor Unit (APU) and the Addresser
Processor Section (APS).

a. APU


The Arithmetic Processor Unit (APU) (fig 15) is
responsible for the computation required in array processing
and executes programs relatively independent of the other
MAP processors, although under the general control of the
CSPU. The APU consists of two adders, two multipliers (the
main difference between the MAP-300 and the MAP-100 or
MAP-200 is that the former contains two adders and
multipliers while the others contain only one each), 34
various registers and three First-In First-Out (FIFO)
buffers for input and output storage. The two adders and
two multipliers permit parallel processing of data to
increase throughput. APU programs are stored in main MAP
memory and are sequentially block-transferred to the APU
program memory under control of the CSPU.


The main units of the APU are the arithmetic
processors (AP1 and AP2). Each arithmetic processor
consists of an adder and multiplier that may operate
simultaneously and independently of each other. Each adder
is fed by eight registers and each multiplier by four
multiplicand registers and four multiplier registers. The
results of the adder are routed to the result register R and
the multiplier results to register P. To transfer
data between the separate arithmetic processors, an exchange
register is provided.


The APU program memory consists of two 256-word 16-bit
side-by-side memories. The memory is initially loaded by
the CSPU from MAP memory and the APU is then put into the
run state. Instructions are sequentially decoded in the APU
to perform the specified algorithm. The instructions are
16-bits for each board (AP1 and AP2) and are executed in
parallel. They can perform addition, multiplication,
transfer of data and the setting of flags. These
instructions are decoded and the operation started as soon
as all necessary conditions are met. Immediately, the next
instruction is retrieved and decoded and attempts to be
executed. If either the P/R register is involved in a
multiplication/addition operation which has not yet been
completed, the Input Queue (IQ) is empty or the Output Queue
(OQ) is full, the APU will go into a "wait" state. It will
remain in this wait state until the
multiplication/addition instruction is completed or the
other conditions are satisfied. There is a problem that can
arise due to the side-by-side 16-bit memories used for
program storage. Since there is only one program counter
and the AP1 and AP2 processors work in parallel, the
side-by-side memory acts as two halves of a 32-bit
instruction register. Therefore if one board (AP1 or AP2) is
forced to wait, the other must also wait since the next
instruction may not be retrieved until the program counter
can be incremented.
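
The cost of sharing one program counter can be illustrated
with a small cycle-count model. This is a sketch under stated
assumptions, not a timing model of the real hardware: each
instruction pair is assumed to cost one issue cycle plus
whatever wait cycles either half incurs, and the function
names and the "independent" comparison case are hypothetical.

```python
def lockstep_cycles(ap1_waits, ap2_waits):
    """Cycles needed when AP1 and AP2 share ONE program counter.

    Each list entry is the number of wait cycles that half of the
    instruction pair incurs at that step. Both halves must finish
    before the shared program counter can be incremented.
    """
    if len(ap1_waits) != len(ap2_waits):
        raise ValueError("both boards execute the same number of steps")
    return sum(1 + max(a, b) for a, b in zip(ap1_waits, ap2_waits))


def independent_cycles(ap1_waits, ap2_waits):
    """Hypothetical comparison: each board has its own program counter."""
    return max(sum(1 + a for a in ap1_waits),
               sum(1 + b for b in ap2_waits))
```

For example, with waits of [0, 3, 0, 0] on AP1 and
[2, 0, 0, 1] on AP2, the shared counter needs 10 cycles
where two independent counters would need 7; the difference
is the time each board spends waiting on the other.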


The Input Queue is a four-deep FIFO buffer which
services both AP1 and AP2. To get the next input data
field, the IQ must be advanced before the data is
transferred. If both boards request data without advancing
the queue, they will receive the same data, which may be
desirable for certain applications. If they both
simultaneously try to advance the IQ, it will advance only
once and give AP1 priority, then advance the second time
after the transfer has been completed to give data to AP2.

There are two Output Queues each of which is a
four-deep FIFO buffer. These queues allow maximum capacity
of the adder and multiplier to be utilized, since it is less
likely that the processor will have to wait for either
buffer to have a vacancy due to a busy bus system. If both
processors try to act on any single OQ, processor AP1 will
be given the priority.
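
The Input Queue rules (read without advancing, advance then
read, and AP1 priority on a simultaneous advance) can be
modeled as follows. The class and method names are
hypothetical, and whether the currently presented data field
counts toward the four-deep limit is an assumption of this
sketch.

```python
from collections import deque


class InputQueue:
    """Four-deep FIFO whose head is presented only after an advance."""

    DEPTH = 4

    def __init__(self):
        self._fifo = deque()
        self._current = None   # data field visible to both AP1 and AP2

    def push(self, value):
        if len(self._fifo) >= self.DEPTH:
            raise OverflowError("input queue full")
        self._fifo.append(value)

    def advance(self):
        if not self._fifo:
            raise LookupError("input queue empty: the APU would wait")
        self._current = self._fifo.popleft()

    def read(self):
        """Reading does not consume; both boards may see the same field."""
        return self._current

    def advance_both(self):
        """Both boards ask to advance at once: AP1 is served first,
        then the queue advances a second time for AP2."""
        self.advance()
        ap1 = self.read()
        self.advance()
        ap2 = self.read()
        return ap1, ap2


def demo():
    iq = InputQueue()
    for value in (10, 20, 30):
        iq.push(value)
    iq.advance()
    shared = (iq.read(), iq.read())   # AP1 and AP2 read without advancing
    split = iq.advance_both()         # AP1 gets the next field, AP2 the one after
    return shared, split
```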


A typical multiplication takes approximately six
cycles (420-nanoseconds) and a typical add takes about three
cycles (210-nanoseconds). Therefore, to increase
throughput, "hiding" adds, moves, etc. behind multiplies
will accomplish operations in the time it takes to do the
multiply alone. The most efficient method to program the
MAP-300 is to treat successive sample sets in alternate
processors; this effectively produces a multiply every
210-nanoseconds. Since there is one input queue, this
method allows both to have access to the same information
(by not incrementing the queue) and also gives a greater
chance to use hiding effectively.
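
The arithmetic behind these figures can be checked with a
short calculation. The 70-nanosecond cycle time is not stated
directly; it is inferred here from "three cycles
(210-nanoseconds)". The function names are illustrative, and
the model ignores start-up effects, so it describes only the
steady state.

```python
CYCLE_NS = 210 // 3          # inferred: three cycles take 210 ns
MULTIPLY_CYCLES = 6
ADD_CYCLES = 3
PROCESSORS = 2               # AP1 and AP2

MULTIPLY_NS = MULTIPLY_CYCLES * CYCLE_NS
ADD_NS = ADD_CYCLES * CYCLE_NS


def serial_ns(n):
    """n multiply-add pairs with no overlap at all."""
    return n * (MULTIPLY_NS + ADD_NS)


def hidden_ns(n):
    """n pairs on one processor with each add hidden behind a multiply."""
    return n * MULTIPLY_NS


def alternating_ns(n):
    """n pairs split across both processors, adds still hidden."""
    return n * MULTIPLY_NS // PROCESSORS
```

With these assumptions a multiply takes 420 ns, and
alternating sample sets between the two processors yields one
product every 420 / 2 = 210 ns, matching the figure quoted
above.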


The APU can usually operate in two modes. Mode
One, the normalized mode, can either use normalized or
unnormalized floating-point numbers as input with the
results being a normalized floating-point number. Using
unnormalized floating-point numbers as input can lead to
precision loss since the normalization process will shift
the mantissa to the left (values less than [illegible]) or
to the right (values greater than 1.0). The vacancies
created by these shifts will be filled with zeros, which,
after computation, could possibly produce an unusual
truncation. The unnormalized mode will accept unnormalized
numbers as input and will return unnormalized numbers as
output [18].


b. APS


The Addresser Processor Section (APS) (fig 16)
computes both the address in MAP memory for the location of
input data words to be processed by the APU and the MAP
memory addresses for the output from the APU. It operates
independently of other processors, within the status and
control constraints of SYSFLG. The APS contains a 128-word
program memory, four program counters (two for read and two
for write), eight address buffers (to be used as inputs to
the adder), two First-In First-Out (FIFO) buffers, an
arithmetic logic unit (adder), and associated logic and
control units.


The APS programs are stored in MAP main memory
and are loaded by the CSPU. Certain absolute address
information must be known to an APS program at run time
which is not available during program writing. The assembler
computes these addresses at assembly time and the CSPU
inserts them into the proper location during this program
transfer. The CSPU then initiates APS operation by setting
the proper flags. The APS may be loaded with new information
by the CSPU during run time by cycle stealing, thereby not
causing the APU to slow and wait for a value in the IQ or a
space in the OQ. Because the instructions in MAP memory are
32-bits long and the APS instruction is only 25-bits long,
the seven bits left over are used to store the APS memory
address for that instruction. This allows the CSPU to
increase throughput by immediately loading each instruction
into the correct location in a pre-computed order.
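
The 32-bit word layout described above (a 25-bit APS
instruction plus a 7-bit APS memory address, which is enough
to address 128 words) can be sketched with shifts and masks.
Which end of the word holds the address is not stated in the
text; placing it in the upper seven bits is an assumption of
this example, as are the function names.

```python
INSTR_BITS = 25
ADDR_BITS = 7
INSTR_MASK = (1 << INSTR_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1


def pack(addr, instr):
    """Build one 32-bit MAP memory word (address assumed in the top bits)."""
    if not 0 <= addr <= ADDR_MASK:
        raise ValueError("APS address must fit in 7 bits")
    if not 0 <= instr <= INSTR_MASK:
        raise ValueError("APS instruction must fit in 25 bits")
    return (addr << INSTR_BITS) | instr


def unpack(word):
    """Split a 32-bit word into (APS address, APS instruction)."""
    return (word >> INSTR_BITS) & ADDR_MASK, word & INSTR_MASK


def load_program(words, memory_size=1 << ADDR_BITS):
    """Place each instruction at the address carried in its own word,
    so the words may arrive in any pre-computed order."""
    memory = [None] * memory_size
    for word in words:
        addr, instr = unpack(word)
        memory[addr] = instr
    return memory
```

Because every word carries its own destination, the loader
needs no separate address bookkeeping, which is the
throughput benefit the text describes.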


The adder computes addresses dependent on prior
computational results, literals or specified increments.
Address addition and subtraction is considered to be
modulo 2 ** 17 so that only positive addresses in that range
will be computed. Results are queued in either the Read
Address FIFO (RAF) or Write Address FIFO (WAF). Along with
the address is a code to delineate whether the address is
a word, half-word or byte (pair of bytes in a 16-bit half
word address) and if it is an eight-bit fixed-point number,
a [illegible] fixed-point number, a 16-bit floating-point
number or a [illegible] floating-point number.
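
Modular address arithmetic of this kind can be illustrated
briefly. The modulus 2 ** 17 is taken from the text above;
the function names and the stride example are hypothetical.

```python
ADDRESS_MODULUS = 2 ** 17


def next_address(base, increment):
    """Add a signed increment to an address, wrapping modulo 2**17 so
    the result is always in the range 0 .. 2**17 - 1."""
    return (base + increment) % ADDRESS_MODULUS


def address_sequence(start, stride, count):
    """Addresses produced by repeatedly applying the same increment."""
    addresses = []
    addr = start % ADDRESS_MODULUS
    for _ in range(count):
        addresses.append(addr)
        addr = next_address(addr, stride)
    return addresses
```

Subtracting past zero wraps to the top of the range rather
than producing a negative address, so a decrementing loop
never leaves the valid address space.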


A distinctive feature of the APS is that there
are four program counters (P0, P1, P2 and P3). These allow
four separate programs to be stored in the APS and executed
in an interleaved manner. Sequencing of these programs is
controlled by the status of the WAF and RAF in conjunction
with the APS instructions. These program counters also
provide a looping ability allowing the APS to work with the
Host Interface Scroll or I/O Scrolls to keep data flowing.
After one memory has been processed and reloaded, the APS
need not be reinitiated but can continue operation on the
new data by this looping feature [18].
3. Host Interface Scroll


The Host Interface Scroll (HIS) consists of two
subsections, the Host Interface Module (HIM) (fig 17) which
is located in the MAP-300 and the Host Interface Controller
(HIC) which is located in the host. The Host
Interface Module transfers MAP programs, unprocessed data,
host status and Host Interface Controller commands from the
host to the MAP. Processed data, MAP status and processing
commands are also transferred from the MAP to the host via
the HIM. A programmable scroll processor is provided for
computing MAP and host memory locations during a Direct
Memory Access (DMA) operation. Other pertinent devices
include a memory-bus interface, controllers for host memory,
format conversion hardware, status and control logic along
with interrupt logic.


The HIC controls the handshaking necessary between
the host and the MAP. The handshaking consists of interrupt
logic from MAP to host and logic necessary for controlling
the transfer of data with either Direct Input/Output (DIO)
or DMA transfer [18].


The host generally interrupts the MAP to initiate
program sequencing. Once the MAP has completed, it
will initiate communication (interrupt) with the host for
further work. When the interrupt is acknowledged by the
host, more data or programs are sent to the MAP depending on
the flags. (If the MAP processors are in a loop operating
on data supplied from external devices and delivered to
external devices via I/O Scrolls, the host will not be
interrupted unless there is an error. This frees the host
to do any other unrelated processing necessary.) The
maximum response time to initiate an interrupt is
[illegible] microseconds for the HIM and 250 microseconds
for a user routine [35].
4. Memory 


Main memory of the MAP consists of three
independent busses each having the capability of 256K words
of 500-nanosecond MOS memory or 64K words of bipolar memory.
Memory types may not be intermixed on any given bus but each
bus may have a different type from another bus. Memory can
also be either master or slave, master memory being used to
control program execution, arbitrate and observe system
protocol while slave memory stores the data. Each memory
bus containing memory is required to have at least one
master memory module (available in either 4K or 8K blocks
for MOS or 1K, 2K, or 4K blocks for bipolar).


Access to each memory is via a common bus having 11
ports and two priority levels. Three ports are reserved to
be used with the absolute priority scheme leaving eight
ports with a sequential (round-robin) priority
scheme. Absolute priority is the highest priority and is
intended to be used with high speed minimally-buffered
devices such as disc units or tape units where loss of data
may result. Sequential round-robin priority handling is
used for slower buffered devices and is a round-robin
(circular) queue which is checked each memory cycle. The
device first in the queue will get the next memory cycle.
Scanning for the next queued device will commence
immediately upon the previous device starting transfer. When
the next memory cycle occurs the new device will be known,
keeping overhead minimal. Of these 11 ports, the HIS and
CSPU each have one dedicated port and the AP has two
dedicated ports on each bus with seven ports remaining for
the IOS and other uses.
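
The two-level port arbitration can be sketched as a single
function that chooses which port receives the next memory
cycle. The port numbering, the fixed ordering among the three
absolute-priority ports and the function signature are
assumptions for illustration; the text specifies only that
absolute ports win and that the remaining eight are scanned
as a circular queue.

```python
def arbitrate(requests, absolute_ports, rr_order, rr_start):
    """Choose the port granted the next memory cycle.

    requests       -- set of port ids currently requesting the bus
    absolute_ports -- absolute-priority ports, highest priority first
    rr_order       -- round-robin ports in circular scan order
    rr_start       -- index in rr_order where scanning begins

    Returns (granted_port_or_None, new_rr_start).
    """
    # Absolute-priority ports always win over the round-robin group.
    for port in absolute_ports:
        if port in requests:
            return port, rr_start

    # Otherwise scan the circular queue starting after the last grant.
    count = len(rr_order)
    for offset in range(count):
        index = (rr_start + offset) % count
        if rr_order[index] in requests:
            return rr_order[index], (index + 1) % count

    return None, rr_start


ABSOLUTE = [0, 1, 2]            # hypothetical ids for the 3 absolute ports
ROUND_ROBIN = list(range(3, 11))  # hypothetical ids for the other 8 ports
```

Advancing the scan position past the port just served is what
keeps one busy device from starving the others in the
round-robin group.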


Pseudo-memory (alluded to earlier) is the upper 4K
words on Bus [illegible], containing addresses of certain
registers used for status and control. These registers are
located in the sub-processors but appear as addresses on the
memory bus. Any sub-processor may alter the contents of
these locations so it is important that the programmer not
try to overwrite these addresses with programs or data [18].


SOFTWARE SUPPORT


As with the AP-120B, there are software routines to aid
in program development and execution.


1. Executive and Associated Routines


a. Assembler


The MAP-300 assembler, written in ANSI Fortran
IV, takes a source program written for either the CSPU, APU,
APS, HIS or IOS and creates an executable object module. A
listing file and errors file can also be created. Editing
and updating can be accomplished from the last source file
by changing and assembling only the incorrect line (or
lines) of code, thereby avoiding the reassembling of the
entire program [15]. The assembler will also allow change
of the file memory to enable it to handle necessary
buffering.
b. Simulator


The MAP Simulator Program simulates model 200
and model 300 processors by executing MAP object code. The
simulator permits the programmer to develop or debug
software off-line so as not to disturb production schedules.


The MAP Simulator Program has the capability of
simulating the operation of the APU, APS, CSPU, memory and
the interrupt handler. It has not been updated to handle
certain new commands and flags (listed in the front of
ref [25]) nor does it have the ability to simulate the APU
test mode. Memory size and type can be specified either
when the simulator is first started or while it is running,
to suit it to current or proposed configurations.


When used as a debugging aid, the MAP Simulator
Program allows the operator to: install breakpoints and
execute macro instructions at these breakpoints; detect
program errors and execute macro instructions after their
discovery; examine register contents; run programs from
different processors (APU, CSPU, etc.) independently; and
patch loaded programs. Input/output may be obtained from a
terminal, printer, tape (magnetic or paper), cards or
cassette. A batch mode is also available. Actual program
timing can be estimated by installing breakpoints and
individually timing small sections of code [25].


c. Loader


The MAP Loader is a Fortran program which
accepts object code produced by the Assembler and creates
blocks of binary code in MAP machine language. This code is
transmitted to the MAP memory via the MAP driver through the
Host Interface Scroll. Errors in transmission are
detectable since check-sum digits are transmitted to the MAP
along with the blocks of code. The Merge operation creates
and updates the tables and addresses necessary if the loaded
program is to be used with the SNAP-II executive [20].


d. Debug Package


The MAP-300 diagnostic package is designed to
verify hardware operations and isolate any malfunctions to a
specific card. One module is resident in the host while
another, which contains the test modules and test programs
necessary to determine proper system operation of the CSPU
and other sub-processors, is present in the MAP. This
package can run interactively or under batch processing
[illegible].


The MAP-300 LOOK program permits the programmer
to examine MAP memory (or pseudo-memory) from any computer
operating under ANSI Fortran IV. This is also an
interactive routine and provides the ability to "patch"
coded program segments or enter entire machine language
programs. The programs or segments can then be stepped
through to examine the results closely [20].


e. SNAP-II


Systematic Notation for Array Processing Version II
or SNAP-II is a single-command high-level macro-type
language used to program the MAP-300 array processor. The
SNAP-II package consists of a Host Support Module, Host/MAP
[illegible] module, SNAP-II Executive, SNAP-II Function
Modules and Installation Test and Acceptance Test Module
[18].


The SNAP-II executive permits the user to define
buffer size, and the structure and location of programs in
MAP memory. The executive also structures the routines to
operate at maximum speed by insuring that the maximum
possible parallelism exists between sub-processors (for CSPI
written functions), hence accentuating "hiding". The
SNAP-II subroutines are written in ANSI Fortran and passed
to the MAP via Function Control Blocks (FCB). The MAP
Driver, which is located in the host, directs the loading
and operation of the programs. (In a loop or "Map While"
operation the driver need only load and initiate the
sequence then return control to the host operating system.)


It also allows the programmer to build his own
function lists with the Fortran type statement "Map Begin
Function List" (MPBFL()) which permits the host to remain as
free as possible from the operation of the MAP. Two
dimensional arrays are demultiplexed by SNAP-II thereby
increasing speed of execution in the processor by not having
to compute two-dimensional addressing structures. SNAP-II
functions are callable from either ANSI Fortran or host
assembly language programs and are able to operate on both
real and complex data [15].
2. Programming Language


If the SNAP-II functions are not specific enough to
satisfy the programmer's needs or if they do not exist in
the SNAP-II library, new routines may be written in an
assembler type language. The CSPU, APU, APS and HIS each
have their own instructions to optimize each sub-processor's
capabilities.


The CSPU instructions are broken into groups
which have the ability to perform all the functions that a
general purpose computer is normally visualized as
performing. They include: generic (performs interrupt
system coding and looping); single register; move; logical;
push and pop; branch and jump (a branch is within 256
half-word locations and a jump can be to any new location);
skip and bit manipulation; compare; and maintenance and test
console instructions. The APU can perform: two-argument
adder; single argument adder (like approximate reciprocal
function); multiply; data transfer; jump and call; and
control operation instructions. The APS performs: load;
address increment; register arithmetic and control type
instructions. The HIS recognizes: single register; logical
register; arithmetic register; literal and control
instruction types [18].


Since each sub-processor is designed to perform a
special operation and can be programmed to optimize that
design, the overall performance of the system is increased.
The processors perform in parallel and stay in "sync" by the
use of flags. A sub-processor will wait until the proper
flag is set before continuing, thereby insuring integrity.
The waiting also relieves the programmer of counting cycles
and filling them with No Operation (NOP) instructions which
could possibly cause lost data. The drawback is that he
does have increased complexity by insuring that proper flags
are set at the proper time [16]. Most of these encumbrances
are eliminated by the executive however. Flags are available
in pseudo memory and are easily tested. The complexity issue
is minimal since for most applications only APU and APS
routines need be written. Only under special circumstances
is a CSPU or HIS routine required.
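
Flag-based synchronization of this sort can be illustrated
with two Python threads standing in for two sub-processors.
This is an analogy only: the MAP flags are hardware bits in
pseudo-memory, whereas the sketch uses a software event, and
the roles and names below are hypothetical.

```python
import threading


def run_pair():
    """One 'sub-processor' supplies an operand and sets a flag; the other
    waits on the flag instead of counting cycles or padding with NOPs."""
    data_ready = threading.Event()   # stands in for a single flag bit
    shared = {}
    result = []

    def supplier():
        shared["operand"] = 21       # make the data available first
        data_ready.set()             # then raise the flag

    def consumer():
        data_ready.wait()            # block until the flag is set
        result.append(shared["operand"] * 2)

    threads = [threading.Thread(target=consumer),
               threading.Thread(target=supplier)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result[0]
```

The consumer is started first on purpose: the result is
correct regardless of which side runs ahead, which is the
integrity guarantee that waiting on a flag provides.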


Pseudo-operations are also available to ease the
programming burden. They perform such tasks as naming
character strings, insuring that information is placed into
memory on a word boundary, generating constants and making a
Control Status Word (CSW).
3. I/O Scrolls


The I/O Scrolls (IOS) control block-transfers to or
from external peripheral devices (including other MAPs)
without interfering with the MAP-300 processing cycle by
using a sub-processor which can be pre-programmed. The IOS
contains three functional elements: protocol logic necessary
to interface the external device directly to the MAP-300
memory busses; a programmable processor to compute MAP
addresses and issue control signals; and the transfer logic
necessary to interface with peripheral devices.


There are five basic IOS models. IOS1, also known
as the maintenance and test console, is capable of
transferring eight-bit single words to MAP bus number one at
a [illegible] rate. IOS2 has two transfer rate options and
two word size options available. Word size option one
utilizes the block-transfer of 8 or 16-bit words to any of
the three MAP busses while option two uses either 16 or
32-bit words. Transfer rate option one conveys information
at a 1 MHz rate compared to the 2.5 MHz rate of option two.
Either transfer rate option may be combined with either word
size option; however, only one combination is available at a
time since they are hard-wired. Under program control, IOS3
can transfer either 16 or 32-bit words to any of the three
busses at a 750 kHz sustained rate. IOS3 can also perform
format conversion, monitor data with a basic operation
similar to the HIM and support indirect addressing. IOS4 is
a high speed (up to 40 MHz) scroll, allowing block transfers
of 8, 16, 32 or 64-bit words to any bus (64-bit words
must be transferred simultaneously to bus 2 and bus 3).
It also allows packing and buffering of data. IOS5
is a direct memory-to-memory bus-connect option for direct
data transfer between user devices and the MAP-300. The
IOS5 requires no software (and will not support software).
Its operation is controlled by hardware and three interrupt
request lines [21].


a. Analog Data Acquisition Module


The Analog Data Acquisition Module
(ADAM-5120) is a programmable analog interface capable of
accepting from 2 to 16 channels of analog information. This
information is then digitized to 12-bit resolution at a 270
[illegible] throughput rate in the multi-channel case
([illegible] KHZ for single channel). As with the I/O
Scrolls, the A/D operation takes place simultaneously with
the MAP-300 processing. The ADAM is functionally equivalent
to the IOS2 with only added analog-to-digital circuitry.
This allows the ADAM to be SNAP-II compatible.


Input to the ADAM is carried out via
[illegible] to 16 sample-and-hold units which then make
their signals available to a 16:1 multiplexer. Each channel
of the multiplexer is then consecutively sampled by the A/D
converter which outputs either a 16-bit sign-magnitude or
[illegible] floating-point number. Performance accuracy is
given as 0.20 percent of full-scale resolution [2].


PROGRAMMING, OPERATION AND EXECUTION


The MAP-300 can not only utilize parallel operations of
the adder and multiplier in the APU, but also the parallel
sub-processor operation of the APS, HIS, IOS, APU and CSPU
to increase total throughput. The programmer, by breaking
the problem into smaller independent programs of addressing,
arithmetic, I/O and management, can theoretically more
easily program the entire problem by adhering to the
internal communication protocol and flags. The
respective programs should be easier to write with much of
the increase in overhead due to the added handshaking and
protocol requirements being assimilated by the executive
[16].


CSPI recommends that a modified top-down programming
technique be used, initially by writing the APU routine
first to insure the optimum execution speed, then adding the
other necessary routines (generally just the APS routines)
to insure the information is present when the APU needs it.
The APU should be programmed to treat subsequent sample sets
in alternate adder/multiplier modules and arrange data so
that as many adds can be "hidden" as possible [18]. By
proper execution sequencing, total time can be shortened to
equal the time to multiply only, with all other operations
hidden under these multiplies. This "hiding" operation
becomes easier in the MAP-300 than in the AP-120B since
cycles need not be counted and NOP's need not be inserted
for unused cycles due to flags being set to signal the
availability of resources [illegible]. The programmer must
be aware that the timing is not absolute, therefore the
executive will tightly control synchronization by flags to
insure one adder/multiplier does not get ahead of the other.


MAP programs are initially loaded from the host to the
MAP via the operating system interface and driver. The
[illegible] routine makes the standard interface through the
operating system and MPDRV.MAC makes the MAP appear as a
standard RSX-11M device to the computer. Initial
communication from the host to the MAP is done via a four
word Driver Control Block (DCB) [illegible]. When the
Central System Processing Unit is initialized by the host,
it will load the other sub-processor programs and commence
program execution.


Subsequent MAP commands are sent to the MAP from the
host via Function Control Blocks (FCB) which require host
intervention to send. (Function lists and the MPWHL macro
treat multiple FCB's as a single entity.) These FCB's
transmit host to MAP status, interrupts and functions to
perform and can be queued in the HIS buffer. When it is no
longer necessary for the host to send or receive a FCB, it
can perform other operations [35]. Therefore, with
use of the IOS and the possibility of stringing
MAPs in series, the host can be free to either perform other
tasks or act as a system monitor.


I DISCUSSION OF FINDINGS


In the test bed, the PDP-11/34 was chosen to perform
the front end functions which consisted of buffering the
data, formatting it and then passing it to the array
processor or mass storage device (or from the mass storage
device to the array processor). This limited front-end
processing function did not dictate that the computer be
large. The choice of the PDP-11/34 computer for this
application seems adequate. The PDP-11/04 would normally
contain enough speed to handle the necessary operations but
may be unsatisfactory since it does not have a resident
memory control and protection routine to ease the
programmer's burden and help insure system integrity, nor
does it contain the 2K cache memory to increase speed. A
computer larger than the PDP-11/34 may not increase the
efficiency of the system although it would increase the
cost.


The test bed utilized the PDP-11/70 for the output
computer. The output computer would be required to receive
information from the array processor, manipulate the data
and store it for future display on one or more devices. For
this application, the PDP-11/70 seems best for several
reasons. The system is much like the 11/34 except that the
current maximum memory is 2 megabytes to allow for better
utilization of information. There are dedicated paths to
high performance storage devices that would allow more
information to be processed per unit of time. To further
process arrays for output, there is a 32-bit or a 64-bit
floating-point arithmetic unit available. The PDP-11/70
gives large-computer performance and expansion capabilities
with the cost and space requirements of smaller units
[illegible]. Using the same manufacturer for the output
function as was used for the input function reduces
interface problems and contributes to the efficiency of the
programmers by increasing overall knowledge of the
architecture.




The proposed test bed uses of the 11/34 and 11/70 can
be modified by the choice of the array processor.
The MAP-300 utilizing an Analog Data Acquisition Module
and I/O Scroll can eliminate the need for the input
functions (it provides 16 channel analog-to-digital
conversion), allowing the 11/70 (or possibly a
less costly model) to perform input, output and monitor
functions in the test bed. The PDP-11/70 will probably
be large enough and fast enough to facilitate combining all
subsystems, except the display subsystem, under one
[illegible]. The 11/34 and 11/70 combination should provide
the full range of computers necessary to properly
emulate and evaluate just how much computer capability will
be needed for any specific application.


The question arises as to which is the best array
processor for the application. The AP-120B is synchronous,
therefore some may say safer, has a 38-bit word which could
mean greater accuracy, more standard library functions (such
as vector log base 10 and vector log base e) and a 3500 hour
mean time before failure. The MAP-300 is a newer system
which, due to the minimal host involvement, three separate
busses, I/O Scrolls and the ADAM, can provide greater long
term throughput and more flexibility.


For the non real-time environment where simple
programming and host involvement can be tolerated, the
AP-120B may be a good choice. It provides facilities to
adapt algorithms to specific needs; these facilities are
not too complex to tax the normal programmer. However,
new programs cannot be added directly to the AP math library
but must be linked and loaded for every usage as
in any application program. This creates an excessive
time overhead. Therefore, the AP-120B should be used only
where simplicity and ease of use are paramount and utility
can be sacrificed.


For applications requiring real-time computations
(which the test bed most likely will eventually demand),
innovative design, high throughput rates and generally
greater flexibility, the MAP-300 provides the answer. The
improved performance of both array-processing potential and
computer availability is offset by the increased cost of
program development if non-library routines must be written.
These routines however may be added to the library,
effectively reducing overhead. Reference [23] reports that
the MAP-300 also complies with MIL-E-18400, MIL-E-5400,
MIL-STD-461A, MIL-STD-704B and MIL-STD-1399.


During the installation of the MAP-300 at the Naval
Postgraduate School, it was noted that the installation
documentation was extremely poor. As of this writing, three
weeks were required to install the system. This was due
mainly to the poor documentation in the installation package
received with the unit. Not only was the package
incomplete, but changes had been made to the software that
were not reflected in the original documentation, nor was an
errata sheet provided.


It is realized that for many companies involved in data
processing equipment manufacture, documentation is not a
primary concern. However, CSPI seems to have installation
documentation far inferior to what would reasonably be
expected. This situation made it impossible to do a good
test of the system operation and allowed only a cursory
review.


Even with the evident shortcomings of the documents,
theoretically the MAP-300 is far superior to the AP-120B.
If CSPI would upgrade their documentation and perform the
installation at the site, their sometimes negative public
image could be eliminated and confidence in their equipment
could be increased. It must be noted, however, that ref [18],
the CSPI publication "Simple Notation For Array Processing,
Version II, Reference Manual", is very well written.
Therefore, in the following discussion, the use of the
MAP-300 will be assumed. I will now look at each subsystem
closely and attempt to determine alternate designs.


The analog subsystem obtains data from one of four
sources: time code reader/generator, 14-track recorder
(Honeywell 96), signal synthesizer (Rockland 5100) and/or a
noise generator (HP 3722A). Up to 128 channels of input are
amplified and sent through a programmable matrix switch,
resulting in 32-channel output signals to a programmable
32-channel filter. These analog signals then leave the
analog subsystem to be input to the signal processing
subsystem.


The AN-5400 analog-to-digital converter performs a
12-bit A/D conversion, and the result is then loaded into the
Ampex Megastore mass storage device through the PDP-11/34
computer. The output of the array processor will then be
sent to the data processing subsystem.


I suggest it may be easier, more flexible and cheaper
to input the 32 channels as before to the programmable
filter, but then the 32 channels may be better handled by
the Analog Data Acquisition Modules, passed directly into
the MAP for processing or, via an I/O Scroll, model 3, sent
to the PDP-11/70 storage devices for future use. This will
eliminate the expense of the A/D converter, Ampex Megastore
and the PDP-11/34 but, more important, it will be relatively
easy to perform calculations in real-time. Once the MAP-300
is started, it can perform without host intervention until
interrupted, and with an assumed input of 40 KHZ the system
should not be taxed. The output of the MAP can then be sent
directly to the data-processing subsystem. The entire system
can also be less complex, affording easier system
development.


Assume that a fictional system with a 40 KHZ input
requires a FFT and discrete digital filter to be done on the
data. A 1024 real to 512 complex FFT requires 3.0
milliseconds [25] and a 40 KHZ input rate would require
39.1 FFT's per second. This would consume 117.3
milliseconds and, assuming a 50 percent overhead, yield
175.95 milliseconds to perform the Fourier transform.
Discrete filtering would require 39.1 x (1024 x (2 x 500
nanoseconds + 12 x 70 nanoseconds)) or 73.67 milliseconds.
Again assuming 50 percent overhead, 110.51 milliseconds
would be necessary for filtering. The total time consumed by
the two functions is 286.5 milliseconds, leaving 713.5
milliseconds for other work. (Fifty percent overhead is an
overestimation.) Reading data into the MAP-300 would be
hidden behind the FFT execution (except for the initial
case) and would not contribute to overall execution time.
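
The arithmetic of this estimate can be retraced with a short
Python sketch. It is not part of the original analysis; the
function and variable names are illustrative, and the filter
term is evaluated over 1024 points per block, which is the
count that reproduces the 110.51 millisecond figure.

```python
# Sketch of the MAP-300 timing budget described above.
# Constants are the figures quoted in the text; names are illustrative.

SAMPLE_RATE_HZ = 40_000      # assumed input rate (40 KHZ)
FFT_POINTS = 1024            # real samples per transform
FFT_TIME_MS = 3.0            # 1024 real to 512 complex FFT, ref [25]
OVERHEAD_FACTOR = 1.5        # 50 percent overhead allowance
# Filter cost per point as quoted: 2 x 500 ns + 12 x 70 ns.
FILTER_NS_PER_POINT = 2 * 500 + 12 * 70


def map300_budget():
    """Return the per-second time budget in milliseconds."""
    # 40000 / 1024 = 39.06, carried as 39.1 in the text.
    ffts_per_second = round(SAMPLE_RATE_HZ / FFT_POINTS, 1)

    fft_ms = ffts_per_second * FFT_TIME_MS * OVERHEAD_FACTOR
    filter_ms = (ffts_per_second * FFT_POINTS * FILTER_NS_PER_POINT
                 * 1e-6 * OVERHEAD_FACTOR)
    busy_ms = fft_ms + filter_ms
    return {
        "ffts_per_second": ffts_per_second,
        "fft_ms": fft_ms,
        "filter_ms": filter_ms,
        "busy_ms": busy_ms,
        "idle_ms": 1000.0 - busy_ms,
    }


if __name__ == "__main__":
    for name, value in map300_budget().items():
        print(f"{name:16s} {value:8.2f}")
```

Under these assumptions the busy time comes to roughly 286.5
milliseconds of each second, in agreement with the totals
given above.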


This would effectively eliminate the entire
signal-processing subsystem with the exception of the
MAP-300. The PDP-11/70 computer in the data processing
subsystem could control the MAP along with its other
intended function of controlling the display subsystem. Any
storage necessary for output or any taped input data could
be handled by the tapes and disks associated with the 11/70,
and execution could be performed on the MAP-300 along with
the above calculations. However, for expanded utilization
not specifically addressed here, the above use of only one
MAP and no PDP-11/34 may have to be modified to accommodate
the new requirements if these new requirements are
significantly larger.


If after extensive testing the MAP-300 proves to be too
costly due to unreliable software, the AP-120B can perform
the same functions, although at an increased hardware and
time cost.


For example, to perform the above real to complex FFT,
the AP-120B requires 5.08 milliseconds for the FFT,
0.8 microseconds to rescale and 1.7 microseconds to reformat
the result, for a total of 5.09 milliseconds per 1024 sample
FFT. To this must be added 100 to 1000 microseconds
overhead for each of the four call statements: Get data
from the AP-120B (APGET), Put data into the AP-120B (APPUT),
real to complex FFT (RFFT) and real FFT scale and
format (RFFTSC). I will use the arithmetic average of 550
microseconds per call for an added 2.2 milliseconds,
resulting in a subtotal of 7.29 milliseconds per FFT. APPUT
and APGET have no specific times in ref [8], but according
to the PDP-11 interface transfer rate given by Floating
Point Systems, approximately 2.67 milliseconds would be
required for each 1024 element transfer, giving a total
of 9.96 milliseconds each. For 39.1 FFT's, this results in a
389.5 millisecond execution and transfer time. Again,
allowing for a 50 percent overhead safety margin, the total
becomes 574.16 milliseconds per second. To perform the
discrete filtering would require an additional APGET, APPUT,
RFFT and RFFTSC as well as a vector multiply (VMUL) and a
complex vector multiply (CVMUL), bringing the time to compute
one second's worth of data to well over one second.
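
The AP-120B estimate above can be retraced the same way. The
Python sketch below is illustrative only; its names are not
taken from any Floating Point Systems software, and it uses
just the per-FFT figures quoted in the text. Evaluated
directly, the products come to about 389.4 and 584.2
milliseconds, close to but not identical with the 389.5 and
574.16 quoted; the difference may stem from rounding or from
transcription of the printed figures.

```python
# Sketch of the AP-120B per-second timing estimate (FFT only).
# Constants are the figures quoted in the text; names are illustrative.

FFTS_PER_SECOND = 39.1       # 40 KHZ input in 1024-sample blocks
FFT_MS = 5.09                # FFT plus rescale and reformat
CALLS_PER_FFT = 4            # APGET, APPUT, RFFT, RFFTSC
CALL_OVERHEAD_US = (100 + 1000) / 2   # arithmetic average, 550 us
TRANSFER_MS = 2.67           # one 1024 element host transfer
OVERHEAD_FACTOR = 1.5        # 50 percent safety margin


def ap120b_budget():
    """Return the AP-120B FFT time estimate in milliseconds."""
    call_ms = CALLS_PER_FFT * CALL_OVERHEAD_US / 1000.0
    subtotal_ms = FFT_MS + call_ms            # compute plus call overhead
    per_fft_ms = subtotal_ms + TRANSFER_MS    # add the host transfer
    per_second_ms = per_fft_ms * FFTS_PER_SECOND
    return {
        "call_ms": call_ms,
        "subtotal_ms": subtotal_ms,
        "per_fft_ms": per_fft_ms,
        "per_second_ms": per_second_ms,
        "with_margin_ms": per_second_ms * OVERHEAD_FACTOR,
    }


if __name__ == "__main__":
    for name, value in ap120b_budget().items():
        print(f"{name:16s} {value:8.2f}")
```

Either way, the FFT alone occupies well over half of each
second before any filtering calls are added.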


Therefore another AP-120B must be installed to ensure
that speed requirements are met. Also, since the host
computer must be interrupted many times, it may be necessary
to retain the PDP-11/34 in the signal processing subsystem.
There is also the consideration that if a math routine is
custom written, it cannot be loaded into the math
library, which will generate considerable overhead each time
it is called. (The amount of this overhead time is system
dependent.)


CONCLUSIONS AND RECOMMENDATIONS


The test-bed as proposed seems to be a workable design,
although for most applications a more efficient and
economical architecture may be constructed.


For many uses the PDP-11/34 computer and the AN-5400 A/D
converter seem unnecessary when the MAP-300 array processor
is used. The Ampex Megastore may be required for a few
applications but would not be suitable for the majority of
applications (including real-time) since a disk peripheral
attached to the PDP-11/70 would be cheaper and still perform
the same functions.


It is felt that the increase in complexity and possible
confusion in using the MAP-300 over the AP-120B can be
overshadowed by the reduction in equipment required by the
MAP-300. This increased proficiency should be even more
greatly felt (assuming a normal learning curve) with
subsequent installations. Also, with the time saving in
execution, extra calculations could be performed on the MAP
in a real-time environment, thereby increasing efficiency,
operability and spectrum.


It is recommended that further tests be conducted using
the actual applications, data types and speed requirements
to determine the most economical and efficient MINIMUM
design necessary.